- > [Console]::OutputEncoding = [Text.UTF8Encoding]::UTF8
- > perl ..\cmd\utf8-tokenize.perl -a ..\lib\german-spoken-abbreviations .\input.txt
- <file name="foo" mode="oral">
- °h
- der
- mädsch
- (
- .
- )
- mädchen
- brauch
- ähm
- (
- 0.5
- )
- wie
- sag
- man
- d
- auf
- DEUTSCH
- (
- --
- )
- <
- <creaky>
- dinero
- >
- </file>
- <file name="bar" mode="oral">
- er
- habt
- gesacht
- (
- dass|das
- )
- GELDbeutel
- (
- .
- )
- geld
- genehmt
- </file>