I want to make an offline sentence dictionary database in my localhost from tatoeba.org dump files in the link tatoebaarchive. As you can notice from the codes, sentences.csv contains language idssuch as engfracmntur and the sentences in that language; and links.csv maps sentences ids to translated parallel sentences.
I want to search a word in a language, say eng, and list pairs of searched word's sentences and translated sentences of it. For example I search "beauty" in English (eng) and list them with its parallel sentences in French (fra).
`sentences.csv` has three columns: `id` `language`  `text`
`links.csv` has two columns: `sentenceId`   `translatedId`
I can make a parallel sentence in three steps.
1) SELECT id FROM sentences WHERE text LIKE('%she is a rare beauty_%') AND language='eng'
       +-------+
       | id    |
       +-------+
       | 21687 |
       +-------+
2) SELECT translatedId FROM links WHERE sentenceId='21687'
      +--------------+
      | translatedId |
      +--------------+
      |       184559 |
      |       517365 |
      |       550067 |
      |      2238371 |
      |      2238372 |
      +--------------+
3) SELECT text FROM sentences WHERE (id ='550067' AND language='fra') OR id ='21687'
      +------------------------------------------+
      | text                                     |
      +------------------------------------------+
      | It is true she is a rare beauty.         |
      | C'est vrai, elle est d'une rare beauté.  |
      +------------------------------------------+
How can I merge three queries into one liner query to get the third step result?
  • Sorry - what exactly is the question? How to import the data to MySQL tables? How to optimize the query speed? How to perform LIKE searches? Your post is very, very unclear... – ılǝ Oct 9 '13 at 7:21
  • Thanks for responding. Core of my question is how to make a query to list parallel sentences that a particular word exist. Like 'It is true she is a rare beauty. > C'est vrai, elle est d'une rare beauté.' – kenn Oct 9 '13 at 7:36
  • how to make a list parallel sentences that a particular word exist - your query is correct. Sorry - it remains unclear if you are asking how to optimize the speed of the query, or how to do a join on the tables, or how to deal with special characters or, or, or – ılǝ Oct 9 '13 at 7:41
  • Example of jukuu.com bilingual sentence searcher exatcly what I want to do. How can I achieve it with sql queries on sentences.csv database? – kenn Oct 9 '13 at 15:51
  • Clarifying my question,I must say it's confusing and it requires subqueries besides my English is not perfect. I want to query a word (say "good") in sentence table from "text" column that contains "eng" language ; and using each found sentence's "id" to make another query from links table, so it will give me "translatedId"s of found sentences and finally using "translatedId"s to make another query in sentences table from "id" column that contains "fra" language. It will list English sentences that includes "good" word with their translated counterparts in French. – kenn Oct 11 '13 at 9:21
  SELECT `sentences`.* FROM 
  `sentences` JOIN 
  `links` ON `id` = `translatedId` 
  WHERE `sentenceId` = (SELECT id FROM sentences WHERE text LIKE('%she is a rare beauty_%') AND language='eng' LIMIT 1);
The result is
  +---------+----------+-------------------------------------------------+
  | id      | language | text                                            |
  +---------+----------+-------------------------------------------------+
  |  184559 | jpn      | 確かに彼女は絶世の美人です。                    |
  |  517365 | deu      | Es ist wahr, sie ist eine seltene Schönheit.    |
  |  550067 | fra      | C'est vrai, elle est d'une rare beauté.         |
  | 2238371 | ber      | S tidet, drusit tsednan ay iceben am nettat.   |
  | 2238372 | ber      | Ccbaa-nnes drus tin ay tt-yesɛan.              |
  +---------+----------+-------------------------------------------------+