How to parse this format (Praat TextGrid)

TextGrid is a segmentation file format used by Praat. I would like to write a parser for it and then validate the data. My question is:

How would you write a parser for this format? Read it line by line, or some other way? Is this a known format?

    File type = "ooTextFile"
    Object class = "TextGrid"
    xmin = 0
    xmax = 93.0538775510204
    tiers? <exists>
    size = 3
    item []:
        item [1]:
            class = "IntervalTier"
            name = "diph"
            xmin = 0
            xmax = 93.0538775510204
            intervals: size = 65
            intervals [1]:
                xmin = 0
                xmax = 1.300090702947846
                text = ""
            intervals [2]:
                xmin = 1.300090702947846
                xmax = 1.5300845864661654
                text = "ey_s"
            intervals [3]:
                xmin = 1.5300845864661654
                xmax = 3.4648692624493815
                text = ""

(This pattern repeats for intervals [4 ... n] until EOF.)

3 answers

A TextGrid parser already exists as part of the NLTK toolkit (in the nltk_contrib package). The Python file is here:

http://nltk.googlecode.com/svn/trunk/nltk_contrib/nltk_contrib/textgrid.py

Updated: https://github.com/nltk/nltk_contrib/blob/master/nltk_contrib/textgrid.py
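If you would rather see what a hand-written parser might look like, here is a minimal sketch for the long text format shown in the question. It is not the NLTK implementation. The function name `parse_textgrid` and the layout of the returned dict (`tiers`, `entries`) are made up for illustration, and it assumes Praat's convention of escaping a double quote inside a string by doubling it. It scans for `key = value` pairs and `item [n]:` / `intervals [n]:` headers with a regular expression, so it does not depend on line breaks or indentation.

```python
import re

# One token is either a numbered header ("item [1]:", "intervals [2]:",
# "points [3]:") or a "key = value" pair whose value is a quoted string
# or a number.
TOKEN = re.compile(
    r'(?P<header>item|intervals|points) \[(?P<index>\d+)\]:'
    r'|(?P<key>\w+) = (?P<value>"(?:[^"]|"")*"|-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)'
)


def convert(raw):
    """Turn a raw value token into a str, int, or float."""
    if raw.startswith('"'):
        return raw[1:-1].replace('""', '"')  # undo doubled quotes
    if raw.lstrip('-').isdigit():
        return int(raw)
    return float(raw)


def parse_textgrid(text):
    """Parse long-format TextGrid text into nested dicts (hypothetical layout)."""
    grid = {"tiers": []}
    target = grid  # the dict that receives the next key/value pair
    for match in TOKEN.finditer(text):
        kind = match.group("header")
        if kind == "item":
            # A new tier starts; later pairs belong to it.
            target = {"entries": []}
            grid["tiers"].append(target)
        elif kind is not None:
            # A new interval (or point) inside the current tier.
            target = {}
            grid["tiers"][-1]["entries"].append(target)
        else:
            target[match.group("key")] = convert(match.group("value"))
    return grid
```

With the sample from the question, `parse_textgrid(text)["tiers"][0]["entries"]` would be a list of dicts with `xmin`, `xmax`, and `text` keys. The sketch ignores the short TextGrid format and does no error reporting, so treat it as a starting point only.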


Automatic Praat TextGrid File Parser (TGP) is a small application for parsing Praat TextGrid files. The result of the parsing is a table, which is saved to an output text file that can be imported by applications such as Excel. TGP is designed to be a flexible program that can be easily extended or changed; currently it can parse only certain types of TextGrid files. Version 1.0 of TGP reads TextGrid files with the following types of elements: word, phone, and optional focus.

http://tgp.peremila.com/


An alternative solution is to work with JSON or YAML representations of these Praat objects; checking the data for correctness then becomes trivial.

I wrote two Perl scripts to facilitate just that (one for converting from Praat to JSON / YAML and one for converting from YAML / JSON to Praat), which can be useful for these tasks.

The scripts are part of a plugin that I maintain, called serialise, which is distributed through CPrAN. The implementation is a bit hacky, but it is pretty stable, and the plugin supports most of the objects you are likely to want to use. All comments are welcome.
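To illustrate why the check becomes simple once the data is in JSON, here is a rough Python sketch. The JSON layout used here (a tier object with `xmin`, `xmax`, and an `intervals` list) is an assumption made for the example, not necessarily what the serialise plugin emits, and `check_tier` is a hypothetical helper. It assumes the properties you want to verify are that each interval has positive duration and that the intervals tile the tier without gaps or overlaps.

```python
import json


def check_tier(tier, tol=1e-9):
    """Return a list of problems found in one interval tier (empty if none)."""
    problems = []
    expected_start = tier["xmin"]
    for i, interval in enumerate(tier["intervals"], start=1):
        if interval["xmin"] >= interval["xmax"]:
            problems.append(f"interval {i}: xmin is not less than xmax")
        # Each interval should begin where the previous one ended.
        if abs(interval["xmin"] - expected_start) > tol:
            problems.append(
                f"interval {i}: starts at {interval['xmin']}, "
                f"expected {expected_start}"
            )
        expected_start = interval["xmax"]
    # The last interval should end at the tier's xmax.
    if abs(expected_start - tier["xmax"]) > tol:
        problems.append(f"tier ends at {expected_start}, expected {tier['xmax']}")
    return problems


# Usage with a small hand-written document in the assumed layout.
document = """
{"name": "diph", "xmin": 0, "xmax": 1.5300845864661654,
 "intervals": [
   {"xmin": 0, "xmax": 1.300090702947846, "text": ""},
   {"xmin": 1.300090702947846, "xmax": 1.5300845864661654, "text": "ey_s"}
 ]}
"""
for problem in check_tier(json.loads(document)):
    print(problem)
```

The tolerance `tol` is there because boundary times are floating-point values; tighten or loosen it to suit your data.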
