I want to make an offline sentence dictionary database in my localhost from
tatoeba.org
dump files in the link tatoebaarchive. As you can notice from the codes, sentences.csv
contains language ids
such as eng
, fra
, cmn
, tur
and the sentences in that language; and links.csv
maps sentences ids
to translated parallel sentences.
I want to search a word in a language, say
eng
, and list pairs of searched word's sentences and translated sentences of it. For example I search "beauty" in English (eng) and list them with its parallel sentences in French (fra).`sentences.csv` has three columns: `id` `language` `text`
`links.csv` has two columns: `sentenceId` `translatedId`
I can make a parallel sentence in three steps.
1)
SELECT id FROM sentences WHERE text LIKE('%she is a rare beauty_%') AND language='eng'
+-------+
| id |
+-------+
| 21687 |
+-------+
2)
SELECT translatedId FROM links WHERE sentenceId='21687'
+--------------+
| translatedId |
+--------------+
| 184559 |
| 517365 |
| 550067 |
| 2238371 |
| 2238372 |
+--------------+
3)
SELECT text FROM sentences WHERE (id ='550067' AND language='fra') OR id ='21687'
+------------------------------------------+
| text |
+------------------------------------------+
| It is true she is a rare beauty. |
| C'est vrai, elle est d'une rare beauté. |
+------------------------------------------+
How can I merge three queries into one liner query to get the third step result?
Not completely what I wanted but this serves the purpose
SELECT `sentences`.* FROM
`sentences` JOIN
`links` ON `id` = `translatedId`
WHERE `sentenceId` = (SELECT id FROM sentences WHERE text LIKE('%she is a rare beauty_%') AND language='eng' LIMIT 1);
The result is
+---------+----------+-------------------------------------------------+
| id | language | text |
+---------+----------+-------------------------------------------------+
| 184559 | jpn | 確かに彼女は絶世の美人です。 |
| 517365 | deu | Es ist wahr, sie ist eine seltene Schönheit. |
| 550067 | fra | C'est vrai, elle est d'une rare beauté. |
| 2238371 | ber | S tidet, drusit tsednan ay icebḥen am nettat. |
| 2238372 | ber | Ccbaḥa-nnes drus tin ay tt-yesɛan. |
+---------+----------+-------------------------------------------------+
LIKE
searches? Your post is very, very unclear... – ılǝ Oct 9 '13 at 7:21how to make a list parallel sentences that a particular word exist
- your query is correct. Sorry - it remains unclear if you are asking how to optimize the speed of the query, or how to do a join on the tables, or how to deal with special characters or, or, or – ılǝ Oct 9 '13 at 7:41