[i18n] Categories for non-English articles?

Luther, Ron Ron.Luther at hp.com
Wed Dec 17 07:46:40 CST 2008

Drs M. Feenstra, MALD, MBA noted:

>>Yesterday, when I submitted the Dutch translation of an article
>>that had already been published on evolt.org,

Hi Marcel,

Excellent!  Thank you for the effort.

>>However, there may be better ways to "label" non-English articles:

Absolutely.  I think that's a very good place for us to start!

I think I might prefer a modification of your 'option 2' [>>2) Use that new category *in addition to* the existing categories].  {Unfortunately, I do not have this 100% worked out ... So I will need some help here because I'm wondering if we may need a slight tweak to our backend architecture in order to accomplish what I'm after.}

Now, like everyone else, I like to get into and play with the architectural implications and start designing the solution before fully defining the problem ... So let's do that!

What I'm thinking is that we have 3 fields:
(a) The current 'category' field as it is.
(b) (And I believe evolt is leaning in this direction anyway.) Implementing a new 'tagging' system (many tags per article) as seen on Slashdot and elsewhere.  That would allow this particular article to be tagged in both languages; "Commentary & Society", "Jobs", "Nederlands", "Commentaar & Maatschappij" and "Banen", etc. so it could be found in any number of future searches.
(c) Adding a new 'language' field.
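
A rough sketch of those three fields side by side, in Python purely for illustration.  The Article class, its field names, the find_by_tag helper, the example title and the use of the two-letter code "nl" are all assumptions made for this sketch; none of it reflects evolt's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Article:
    """Hypothetical article record carrying the three proposed fields."""
    title: str
    category: str                                # (a) the existing single category
    language: str                                # (c) the new language field
    tags: Set[str] = field(default_factory=set)  # (b) any number of free-form tags


def find_by_tag(articles: List[Article], tag: str) -> List[Article]:
    """Return the articles carrying the tag, ignoring letter case."""
    wanted = tag.casefold()
    return [a for a in articles if wanted in {t.casefold() for t in a.tags}]


# The Dutch translation, tagged in both languages as described above.
article = Article(
    title="Example article (Dutch translation)",  # placeholder title
    category="Commentary & Society",
    language="nl",
    tags={"Commentary & Society", "Jobs", "Nederlands",
          "Commentaar & Maatschappij", "Banen"},
)
```

With this shape, find_by_tag([article], "banen") and find_by_tag([article], "Jobs") both return the same article, which is the "found in any number of future searches" behaviour.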

Where I'm a little fuzzy is whether this means that we need to create and maintain a hierarchical table in the backend in order to roll these tags up into discernible 'categories' or not.  [Maybe all we need is a simple, build-as-you-go translations table --- Our first record could say: for language = 'Dutch' and category = 'Jobs' display 'Banen'?  We could build out additional records as we acquire new articles.]
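
The build-as-you-go translations table could be as small as the following sketch.  The names TRANSLATIONS, add_translation and display_category are made up for illustration, and falling back to the untranslated category name when no record exists yet is my assumption, not something we have decided.

```python
from typing import Dict, Tuple

# Hypothetical translations table keyed by (language, category).
# The first record is the one suggested above.
TRANSLATIONS: Dict[Tuple[str, str], str] = {
    ("Dutch", "Jobs"): "Banen",
}


def add_translation(language: str, category: str, label: str) -> None:
    """Add or update one record as new articles arrive."""
    TRANSLATIONS[(language, category)] = label


def display_category(language: str, category: str) -> str:
    """Return the localized label, or the original name if none is recorded.

    The fallback is an assumption made for this sketch.
    """
    return TRANSLATIONS.get((language, category), category)
```

Here display_category("Dutch", "Jobs") gives "Banen", while a pair with no record yet simply shows the existing category name until someone adds a translation.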

Now let's go back and look at potential requirements:

Let's imagine for a moment that we have a future evolt where we have 250 articles published on CSS-3 ... 50 articles in each of 5 different languages including at least one RTL (Hebrew perhaps) and at least one language that requires a non-Latin-1 character set (maybe some Traditional Chinese in a Big5 character set from a Hong Kong evolter).

Here are the functionalities I would like to enable:

(a) I'd like an [x-language]-only reader to be able to restrict the search results or display page to [x-language]-only articles and information.  It does me no good to even see that we have articles available in katakana or Urdu if I can't read those character sets or languages.

(b) I'd like someone who reads many languages to be able to search a topic for all articles in all languages.  Sometimes it might be fun to see Cyrillic and Big5 character set articles mixed in with the others on a particular topic.  {How much more of a world-wide resource would we become if we merely announced the existence of the browser archive in a larger number of short articles in different non-Western languages?}

[I'm thinking there may be some interesting work here in order to display the search result short summaries in RTL language articles intermingled with LTR language articles ... but I haven't worked with this so I don't know for sure.]

(c) I want to let someone select 5 or so languages and restrict search results to everything in those languages.  Outside the US there are lots of people who have more than a passing familiarity with multiple languages.  It would be really sweet to let someone pick 'English', 'French', 'Dutch', and 'German' and be able to show them all results restricted to that _set_ of languages.

(d) Ideally, I might want to allow a reader to set a single or multi-language preference in a cookie or a logged-in profile and automagically restrict results to that set.  [And yes, naturally, we would give them a manual option to override any preset preferences on an individual search basis.]
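
Requirements (a) through (d) all seem to reduce to filtering on a set of languages, so here is a minimal sketch under that assumption.  The function names, the ALL_LANGUAGES sentinel, the dictionary layout of an article and the two-letter language codes are all hypothetical; how the saved preference is read from a cookie or profile is left out entirely.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional

# Hypothetical sentinel: an empty set means "do not restrict by language".
ALL_LANGUAGES: FrozenSet[str] = frozenset()


def effective_languages(saved: Optional[Iterable[str]],
                        override: Optional[Iterable[str]] = None) -> FrozenSet[str]:
    """Pick the language set for one search.

    A per-search override (d) wins over the saved preference; with
    neither present, the result is empty, meaning all languages (b).
    """
    chosen = override if override is not None else saved
    return frozenset(chosen or ())


def filter_articles(articles: List[Dict[str, str]],
                    languages: FrozenSet[str]) -> List[Dict[str, str]]:
    """Keep articles whose language is in the set; an empty set keeps all."""
    if not languages:
        return list(articles)
    return [a for a in articles if a["language"] in languages]


# Placeholder data for illustration only.
articles = [
    {"title": "A", "language": "en"},
    {"title": "B", "language": "nl"},
    {"title": "C", "language": "he"},
]
saved = {"en", "nl"}  # e.g. read from a cookie or a logged-in profile

preferred = filter_articles(articles, effective_languages(saved))
everything = filter_articles(articles, effective_languages(saved, ALL_LANGUAGES))
```

A single-language reader (a) is just the case where the set has one member, and a multi-language pick (c) is the same call with a larger set.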

I'm just not sure I clearly see the best path to reach these kinds of goals.

And it's not just me; obviously we have some other people who may have even cooler goals than I do ... so we want to create a flexible and extensible enough solution to be able to reach those additional goals as well.

