Unknown audio format - where to dig?

On my Android phone (Philips Xenium W632) I can record calls with the phone's built-in software (no external programs; the option is enabled in the service menu). The problem is that none of the players I tried recognizes the resulting format. Judging by their contents, the files do not appear to be compressed or encrypted, so someone familiar with this field could probably identify the format without much difficulty. I suspect this question is too specific for SO, but I don't know any active forums where such people hang out, so I would appreciate pointers to such resources. Suggestions for a better title or tags for this question are also welcome.

Technical information: the files have names like "Mon_Apr_2013__10_48_56.vm", each starts with the header bytes 0x66 0xAA, and they take up about 7.9 kB per second of recording. I can of course provide sample files.

UPD 1) I have posted the following files here: 10 second recording, 133 kB ; 122 second recording, 975 kB

2) Assuming it might be a “Samsung VoiceMemo file”, I first tried the Qualcomm PureVoice converter, then the PureVoice application, then Samsung PC Studio version 7.2.24.9. All of them failed.

3) Tried MediaInfo (thanks @Jan for the suggestion). It could not recognize the files.

5 answers

Try MediaInfo. It can identify almost any audio or video codec and container format on the planet. If that does not work, upload a file somewhere and I can take a look.

0
source

Try FFmpeg (ffmpeg -i file) or MPlayer (mplayer -identify). If you end up having to re-record the files, see if you can capture the playback through the headphone jack. That way the loss of quality is minimal.

0
source

I tried a bunch of tools; the only thing that produced something that didn't sound like a buzz was converting the file with "sox", treating it as LPC or LPC10 encoded.

Of course, it really cannot be called a “voice recording”, as it sounds like muffled murmurs.

Here is what I did:

 cp Sun_Apr_2013__18_11_58.vm Sun_Apr_2013__18_11_58.lpc
 sox Sun_Apr_2013__18_11_58.lpc Sun_Apr_2013__18_11_58.wav
 cp Sun_Apr_2013__18_11_58.vm Sun_Apr_2013__18_11_58.lpc10
 sox Sun_Apr_2013__18_11_58.lpc10 Sun_Apr_2013__18_11_58.wav

Sox is a kind of brute-force approach, but, as others have said, the analog cable method may be your best bet.

0
source

Crazy long shot: if it is raw data (133 kB / 10 s looks a bit like uncompressed 16-bit mono, IIRC), you could write a program that reads some of the data and plots it. If the plot looks like a reasonably smooth curve, it's just a matter of changing the program to output a sound file instead of drawing a curve. I am almost sure there are libraries for this in most programming languages.
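A rough sketch of that idea in Python, using only the standard library. Instead of drawing a graph it computes a crude smoothness number, which is my own stand-in for eyeballing the curve: the average jump between neighbouring samples. The function names, the 8000 Hz sample rate and the 16-bit little-endian mono layout are all guesses, not something known about the .vm files.

```python
import struct
import wave


def mean_abs_step(raw: bytes, limit: int = 4000) -> float:
    """Average absolute difference between neighbouring 16-bit LE samples.

    Real audio tends to change gradually, so this stays small; data that
    is not PCM at all behaves like noise and gives a large value.
    """
    count = min(len(raw) // 2, limit)
    if count < 2:
        return 0.0
    samples = struct.unpack("<%dh" % count, raw[: count * 2])
    steps = [abs(b - a) for a, b in zip(samples, samples[1:])]
    return sum(steps) / len(steps)


def raw_to_wav(raw: bytes, wav_path: str, sample_rate: int = 8000) -> int:
    """Wrap raw bytes in a WAV header as 16-bit mono PCM (an assumption).

    Returns the number of samples written.
    """
    usable = len(raw) - (len(raw) % 2)  # drop a trailing odd byte, if any
    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(1)       # mono (guess)
        wav.setsampwidth(2)       # 16 bits per sample (guess)
        wav.setframerate(sample_rate)
        wav.writeframes(raw[:usable])
    return usable // 2
```

You would read the whole .vm file in binary mode, pass the bytes to mean_abs_step, and call raw_to_wav only if the result looks promising. For uniformly random 16-bit values the average step comes out at roughly a third of the full 16-bit range, so a value in that region suggests the data is not plain PCM.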

0
source

This is a rather difficult case. I looked into several things that might help identify the file. Spoiler: I could not play the file correctly either.

File header

The file header is 66 AA C2 00 in hexadecimal. Searching the web for this signature turns up nothing.

Linux file tool

Running file on any of the samples does not produce meaningful results.

 $ file *.vm
 Sun_Apr_2013__18_11_58.vm: data
 Sun_Apr_2013__18_23_11.vm: data

File structure research

Perhaps the most interesting results came from looking at hex dumps of the files. Here is an excerpt from a random location in the smaller file:

 0001-ea10: 12 02 14 00-70 00 00 00-43 45 15 75-e4 51 00 04 ....p... CE.uQ.
 0001-ea20: 00 00 cc 00-0b 0b 00 00-00 00 00 00-00 00 00 00 ........ ........
 0001-ea30: 3f 00 3f 00-10 27 00 00-00 00 00 00-00 00 00 00 ?.?..'.. ........
 0001-ea40: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00 ........ ........
 0001-ea50: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00 ........ ........
 0001-ea60: 00 00 00 00-00 00 07 20-68 5f 6b b7-7c 84 07 00 ........ h_k.|...
 0001-ea70: 0f cf 00 74-14 a1 22 44-4c 9f a7 34-80 bc ce f0 ...t.."D L..4....
 0001-ea80: 21 07 ae 87-4e 6f 00 00-16 7a eb cd-c5 47 42 26 !...No.. .z...GB&
 0001-ea90: 73 08 04 de-60 85 8d de-15 a4 85 10-c0 fe 1a 8f s...`... ........
 0001-eaa0: 35 32 f8 c6-bb 5f 0a 00-34 f0 e9 a9-35 a8 9f f8 52..._.. 4...5...
 0001-eab0: 44 81 5c 24-3f 11 97 52-cb 1a 64 86-21 14 5d d9 D.\$?..R ..d.!.].
 0001-eac0: 93 b1 1a 32-ad 49 07 00-66 aa c2 00-84 3a 91 00 ...2.I.. f....:..
 0001-ead0: 2b 05 12 02-14 00 70 00-00 00 43 45-15 75 e4 51 +.....p. ..CE.uQ
 0001-eae0: 00 08 00 00-cc 00 0b 0b-00 00 00 00-00 00 00 00 ........ ........
 0001-eaf0: 00 00 3f 00-3f 00 10 27-00 00 00 00-00 00 00 00 ..?.?..' ........
 0001-eb00: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00 ........ ........
 0001-eb10: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00 ........ ........
 0001-eb20: 00 00 00 00-00 00 00 00-07 20 55 04-7a 33 8c 28 ........ ..U.z3.(
 0001-eb30: 11 c0 3c 0f-00 df 20 75-5e 05 73 61-8e 67 07 4c ..<....u ^.sa.gL
 0001-eb40: b1 82 41 52-f5 54 51 0a-00 00 aa 20-2f 6c 9f 04 ..AR.TQ. ..../l..
 0001-eb50: f7 59 14 11-15 c5 08 2d-d9 f4 aa 64-19 65 3c 9d .Y.....- ...de<.
 0001-eb60: a2 80 32 38-16 0c a2 2e-01 00 34 f0-e9 a9 35 a8 ..28.... ..4...5.
 0001-eb70: 9f f8 44 81-5c 24 3f 11-97 52 cb 1a-64 86 21 14 ..D.\$?. .R..d.!.
 0001-eb80: 5d d9 93 b1-1a 32 ad 49-07 00 66 aa-c2 00 89 3a ]....2.I ..f....:
 0001-eb90: 91 00 2b 05-12 02 14 00-70 00 00 00-43 45 15 75 ..+..... p...CE.u
 0001-eba0: e4 51 00 0c-00 00 cc 00-0b 0b 00 00-00 00 00 00 .Q...... ........
 0001-ebb0: 00 00 00 00-3f 00 3f 00-10 27 00 00-00 00 00 00 ....?.?. .'......
 0001-ebc0: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00 ........ ........
 0001-ebd0: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00 ........ ........
 0001-ebe0: 00 00 00 00-00 00 00 00-00 00 07 20-2d 7a 37 35 ........ ....-z75
 0001-ebf0: 70 92 88 88-07 68 00 17-d0 43 0c d3-f2 c9 49 1c p....h.. .C....I.
 0001-ec00: 42 bd 57 70-7a fc 41 e0-67 cb 00 00-b4 5e 76 0c B.Wpz.A. g....^v.
 0001-ec10: fd 23 74 31-19 bc 3b 1e-9e a8 86 29-cc 81 24 0e .#t1..;. ...)..$.
 0001-ec20: d4 3a c2 9b-18 40 6b da-3a 2a 02 00-34 f0 e9 a9 .: ...@k. :*..4...
 0001-ec30: 35 a8 9f f8-44 81 5c 24-3f 11 97 52-cb 1a 64 86 5...D.\$ ?..R..d.
 0001-ec40: 21 14 5d d9-93 b1 1a 32-ad 49 07 00-66 aa c2 00 !.]....2 .I..f...
 0001-ec50: 8d 3a 91 00-2b 05 12 02-14 00 70 00-00 00 43 45 .:..+... ..p...CE
 0001-ec60: 15 75 e4 51-00 10 00 00-cc 00 0b 0b-00 00 00 00 .uQ... ........
 0001-ec70: 00 00 00 00-00 00 3f 00-3f 00 10 27-00 00 00 00 ......?. ?..'....
 0001-ec80: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00 ........ ........
 0001-ec90: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00 ........ ........
 0001-eca0: 00 00 00 00-00 00 00 00-00 00 00 00-07 20 71 15 ........ ......q.
 0001-ecb0: 35 b5 74 80-00 80 51 3b-80 7f 3a 0f-e0 19 6e 2d 5.t...Q; ..:...n-
 0001-ecc0: 0a 03 e3 80-7d 5a a8 fb-0a 0d fa 66-00 00 8e 28 ....}Z.. ...f...(
 0001-ecd0: d6 cd df 07-64 07 dd 89-3b af 08 0a-61 06 11 98 ....d... ;...a...
 0001-ece0: 04 78 1a 82-7f 4d 7a 08-cf 6a e9 7c-0c 00 34 f0 .x...Mz. .j.|..4.
 0001-ecf0: e9 a9 35 a8-9f f8 44 81-5c 24 3f 11-97 52 cb 1a ..5...D. \$?..R..
 0001-ed00: 64 86 21 14-5d d9 93 b1-1a 32 ad 49-07 00 66 aa d.!.]... .2.I..f.
 0001-ed10: c2 00 91 3a-91 00 2b 05-12 02 14 00-70 00 00 00 ...:..+. ....p...
 0001-ed20: 43 45 15 75-e4 51 00 14-00 00 cc 00-0b 0b 00 00 CE.uQ. ........
 0001-ed30: 00 00 00 00-00 00 00 00-3f 00 3f 00-10 27 00 00 ........ ?.?..'..
 0001-ed40: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00 ........ ........
 0001-ed50: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00 ........ ........
 0001-ed60: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 07 20 ........ ........
 0001-ed70: 28 71 63 90-c9 2a 13 40-1f 6a 80 97-88 b6 61 82 (qc..* .@ .j....a.
 0001-ed80: 8e 95 41 67-78 8a d0 46-50 d0 74 06-1a b8 00 00 ..Agx..F Pt....
 0001-ed90: 14 0e e3 29-2d 09 87 a7-52 17 13 19-b0 80 da b0 ...)-... R.......
 0001-eda0: 02 4c 39 e9-03 d2 30 95-7a b2 0b 12-0e 7b 0a 00 .L9...0. z....{..
 0001-edb0: 34 f0 e9 a9-35 a8 9f f8-44 81 5c 24-3f 11 97 52 4...5... D.\$?..R
 0001-edc0: cb 1a 64 86-21 14 5d d9-93 b1 1a 32-ad 49 07 00 ..d.!.]. ...2.I..
 0001-edd0: 66 aa c2 00-96 3a 91 00-2b 05 12 02-14 00 70 00 f....:.. +.....p.
 0001-ede0: 00 00 43 45-15 75 e4 51-00 18 00 00-cc 00 0b 0b ..CE.uQ ........
 0001-edf0: 00 00 00 00-00 00 00 00-00 00 3f 00-3f 00 10 27 ........ ..?.?..'
 0001-ee00: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00 ........ ........
 0001-ee10: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00 ........ ........
 0001-ee20: 00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00 ........ ........

Look, for example, at the CE pattern, which repeats at regular intervals that vary across the file. Some observed intervals are 159 and 192 bytes. The CE marker is not present throughout the file; sometimes it shows up as cE or in some other form. Evidently the file as a whole consists of frames of more or less constant length.
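The intervals do not have to be counted by hand. Here is a small Python sketch that lists every offset of a byte marker and builds a histogram of the distances between consecutive hits. The function names are mine, and using the 66 AA bytes as the marker is only a guess at where frames begin. If I read the excerpt above correctly, 66 aa c2 00 recurs there every 0xC2 = 194 bytes (offsets 0001-eac8, 0001-eb8a, 0001-ec4c and so on), so that spacing is the first thing I would look for.

```python
from collections import Counter
from typing import List


def marker_offsets(data: bytes, marker: bytes) -> List[int]:
    """Return the offset of every occurrence of marker in data."""
    offsets = []
    pos = data.find(marker)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(marker, pos + 1)
    return offsets


def gap_histogram(data: bytes, marker: bytes) -> Counter:
    """Count how often each distance between consecutive markers occurs."""
    offsets = marker_offsets(data, marker)
    return Counter(b - a for a, b in zip(offsets, offsets[1:]))
```

Calling gap_histogram(data, b"\x66\xaa").most_common(5) on the contents of a whole file would show whether one or two frame lengths dominate. Short markers can also match by accident inside the payload, so a few odd gaps in the result are to be expected.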

Playing a file as PCM

A fairly constant ratio of file size to recording time suggests a simple coding scheme. The simplest possible scheme would be PCM: one sample, stored unmodified, per sampling interval.

If you force the file to be played as PCM (pulse-code modulation) data, i.e. what a wav file usually contains in the Windows world, using a Linux command such as

 aplay -c 2 -f S16_LE Sun_Apr_2013__18_11_58.vm 

you hear successive stretches of different, fairly uniform noise. They are probably caused by the different frame lengths discussed above. However, there is no trace of speech or anything similar, which you would expect if the problem were merely a byte-order (big-endian vs. little-endian) or similar mix-up. This suggests that a more advanced coding scheme is used here.
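For reference, the data rates involved are easy to work out. The sketch below is plain arithmetic in Python on the two sample sizes given in the question, assuming 1 kB = 1000 bytes. The helper names are mine, and the comparison figures (8 kHz 16-bit mono PCM, and AMR narrowband at its top rate of 12.2 kbit/s) are just reference points, not known properties of these files.

```python
def bytes_per_second(size_kb: float, seconds: float) -> float:
    """Average data rate of a recording, assuming 1 kB = 1000 bytes."""
    return size_kb * 1000 / seconds


def pcm_bytes_per_second(sample_rate: int, bits: int, channels: int) -> int:
    """Data rate of uncompressed PCM with the given parameters."""
    return sample_rate * bits // 8 * channels


# Sample files from the question: (size in kB, duration in seconds).
short_rate = bytes_per_second(133, 10)    # 13300.0 bytes/s
long_rate = bytes_per_second(975, 122)    # about 7992 bytes/s

# Reference points (assumptions, not measurements of the .vm files).
telephone_pcm = pcm_bytes_per_second(8000, 16, 1)  # 16000 bytes/s
amr_12k2 = 12200 / 8                               # 1525.0 bytes/s
```

Both samples come out below 8 kHz 16-bit mono PCM and well above the payload rate of AMR at 12.2 kbit/s.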

Documentation

The available documentation hints that the AMR codec is used. However, it says AMR is used for voice recording (which should probably be read as the dictation feature). None of the English-language documents I saw mention call recording, and neither did a quick Google search. Call recording is probably a region-specific feature.

Conclusion

The presence of a large number of zeros indicates that this format has not been optimized for size. Regular AMR files do not contain such areas of consecutive zeros.

The presence of variable-length frames in the binary data is a sign of a more advanced format. Together with the zeros, it also more or less rules out an encrypted payload, since encryption would obscure the runs of zeros.
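Both observations can be put into numbers. The following Python sketch, with function names of my own choosing, measures the share of zero bytes, the longest run of zeros, and the Shannon entropy of the byte distribution. Encrypted or heavily compressed data usually comes close to 8 bits per byte and has no long zero runs, so a clearly lower entropy would support the reasoning above. It is only a heuristic, not proof.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def zero_fraction(data: bytes) -> float:
    """Share of bytes in data that are 0x00."""
    return data.count(0) / len(data) if data else 0.0


def longest_zero_run(data: bytes) -> int:
    """Length of the longest run of consecutive 0x00 bytes."""
    best = current = 0
    for value in data:
        current = current + 1 if value == 0 else 0
        best = max(best, current)
    return best
```

It may be worth running these on individual frames as well as on the whole file, because the zero-filled header area of each frame pulls the overall entropy down.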

The relatively constant ratio of audio length to file size suggests that this is a relatively simple format. However, the absence of any audible artifacts of real recording when playing PCM indicates a more complex format.

It is noteworthy that Philips also produces voice recorders that use their own .dss file format. They are touted as being optimized for small file sizes - something that doesn't apply to these files.

Thus, I am ready to bet that it is an AMR-encoded file with a non-standard header and, possibly, a non-standard file format as a whole.

How to continue the study

  • Look for other users in the region (and language) where you bought this phone. It appears that the call recording feature is not available in the US and UK models.
  • Email Philips to ask them about the format and how to play it on a PC. This may be the easiest route.
  • Examine individual frames to identify any similarities between what you have and what frames should look like in AMR or similar codecs.
  • Just record the calls over an analog connection and save a lot of time :)
0
source
