parsing - Text file of all titles / topic titles in Freebase


I need a text file that contains the title of every topic (item) in Freebase, each on its own line.

How can I find or make one if I have downloaded the Freebase RDF dump?

If possible, I also need a separate text file with each topic's description on a single line, each description on its own line.

How can I do that?

I would appreciate it if someone could help me make either of these files from the Freebase RDF dump.

Thanks in advance!

Filter the RDF dump on the predicate/property ns:type.object.name. If you want a particular language, also filter on the language tag, e.g. @en.
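A minimal sketch of that filter, which writes one English title per line. It assumes each triple is a single tab-separated line of the form subject, predicate, object, with the closing period either attached to the object or in a fourth field. The function name `extract_names` and the output file name in the usage note are made up for illustration.

```shell
# Hypothetical helper: print the English names found in a gzipped dump.
# Assumed line shape (not verified against every dump release):
#   ns:m.0abc<TAB>ns:type.object.name<TAB>"Some title"@en.
extract_names() {
  # $1: path to the gzipped RDF dump; titles go to stdout, one per line
  gzip -dc "$1" |
    awk -F'\t' '
      $2 == "ns:type.object.name" && $3 ~ /@en *\.?$/ {
        s = $3
        sub(/@en *\.?$/, "", s)   # drop the language tag and trailing period
        sub(/^"/, "", s)          # drop the opening quote
        sub(/"$/, "", s)          # drop the closing quote
        print s                   # escapes inside the literal are left as-is
      }'
}
```

Usage would look like `extract_names freebase-rdf-2013-06-30-00-00.gz > titles.txt`. Swapping the predicate for ns:common.topic.description should give the descriptions in the same way.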

Edit: I missed the second part, about descriptions being desired as well. Here's a three-part regex which will give you lines with:

  1. English names
  2. English descriptions
  3. a type of /common/topic

Combining the three is left as an exercise for the reader.

zegrep $'\tns:(((type\\.object\\.name|common\\.topic\\.description)\t.*@en)|type\\.object\\.type\tns:common\\.topic)\\.$' freebase-rdf-2013-06-30-00-00.gz | gzip > freebase-rdf-2013-06-30-00-00-names-descriptions.gz 
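One possible way to combine the three kinds of lines in the filtered file, sketched under assumptions: lines are tab-separated triples, and the set of topic subjects fits in memory (it may not on a small machine, given the size of Freebase). The function name `split_topics` and the output file names are hypothetical.

```shell
# Hypothetical helper: from the filtered file, keep only subjects typed as
# common.topic and write their English names and descriptions to two files.
split_topics() {
  # $1: filtered .gz file, $2: titles output file, $3: descriptions output file
  plain=$(mktemp) || return 1
  gzip -dc "$1" > "$plain"          # awk reads the data twice, so unpack once
  awk -F'\t' -v titles="$2" -v descs="$3" '
    # Pass 1: remember every subject that has the common.topic type.
    NR == FNR {
      if ($2 == "ns:type.object.type" && $3 ~ /^ns:common\.topic *\.?$/)
        topic[$1] = 1
      next
    }
    # Pass 2: emit English names and descriptions of remembered subjects.
    ($1 in topic) && $3 ~ /@en *\.?$/ {
      s = $3
      sub(/@en *\.?$/, "", s)       # drop the language tag and trailing period
      sub(/^"/, "", s)              # drop the opening quote
      sub(/"$/, "", s)              # drop the closing quote
      if ($2 == "ns:type.object.name") print s > titles
      else if ($2 == "ns:common.topic.description") print s > descs
    }
  ' "$plain" "$plain"
  rm -f "$plain"
}
```

Usage would look like `split_topics freebase-rdf-2013-06-30-00-00-names-descriptions.gz titles.txt descriptions.txt`. Reading the file twice keeps the logic simple; if the dump turns out to be grouped by subject, a single pass that buffers one subject at a time would use far less memory.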

It seems to have a performance issue that I'll have to look at. A simple grep of the entire file takes ~11 min on my laptop, but this has been running for several times that. I'll have to look at it later, though...

