[Bf-docboard] Bf-docboard Digest, Vol 92, Issue 1

Ton Roosendaal ton at blender.org
Tue Oct 9 12:07:11 CEST 2012


Hi Dan,

Since you have an account, I would be very pleased to have you help with installing/upgrading MediaWiki.
Is that possible?

On the 'wall of text': I would suggest removing the Sphinx search and putting a simple Google search button in its place. I didn't understand everything in your long text, but "Sphinx" seems to be the cause of everything that's complicated and hard to work with. Remove complexity, we don't need it.

-Ton-

------------------------------------------------------------------------
Ton Roosendaal  Blender Foundation   ton at blender.org    www.blender.org
Blender Institute   Entrepotdok 57A  1018AD Amsterdam   The Netherlands

On 8 Oct, 2012, at 18:58, Dan McGrath wrote:

> I figured that rather than replying to a bunch of separate emails, I
> would answer this one, and also expand it by giving any future admins
> a bit of insight into the setup where possible.
> 
> On Mon, Oct 8, 2012 at 8:14 AM, Kesten Broughton
> <solarmobiletrailers at gmail.com> wrote:
>> 1) Media wiki needs an upgrade, the current one seems to be unstable
>> by media wiki, do you mean wiki.blender.org?
>> what instabilities have been found?
> 
> As Brecht discusses in the next email, I think it is more of a problem
> about the system not being maintained, as far as updates go, on a
> regular basis.
> 
>> 2) If you type a url, the system now demands a captcha for people who are
>> logged in even. Can this be disabled?
>> For blender, devs can check changes in the svn repository to find out what
>> change broke stuff.  Is there a similar code base for the wiki?
> 
> This is a bit of a pain point (at least in my experience).
> Essentially, there are 2 svn repositories. One is public and, I
> believe, anyone can access it. The second is a private SVN repo that
> mindrones was using to track the live changes to the wiki code base,
> by manually cherry-picking changes that I had pushed to the public
> one, pushing them into the private one, and then pulling those
> changes onto the actual production (or test) server.
> 
> The biggest pain point with the private repository, though, is that
> it only contains a small portion of the actual wiki that you deploy
> onto the production server. So, as you can imagine, you get a ton of
> "?" listings every time you run `svn status`, since it appears that
> no ignore entries were ever maintained. Also, the private SVN repo
> was not always in sync. As a result, getting actual changes deployed
> involved a lot of hoops to jump through.
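> 
> To give an idea of the noise, here is a minimal Python sketch (not
> tooling that exists on the server) that pulls the unversioned "?"
> entries out of `svn status` output. The sample paths are made up for
> illustration.
> 

```python
def unversioned_paths(status_output):
    """Return the paths that `svn status` reports as unversioned ('?')."""
    paths = []
    for line in status_output.splitlines():
        # The first column of a status line is the item's state;
        # '?' means the item is not under version control.
        if line.startswith("?"):
            paths.append(line[1:].strip())
    return paths


# Hypothetical output, only to show the shape of the data.
sample = (
    "?       cache/l10n.cdb\n"
    "M       LocalSettings.php\n"
    "?       images/tmp\n"
)

for path in unversioned_paths(sample):
    print(path)
```

> A list like this could be the starting point for svn:ignore
> properties, which are set per directory.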
> 
> As for the databases, there are 2 MySQL DBs whose names, iirc, are
> based on the type of DB (production vs test) concatenated with a
> timestamp of when they were put in place. Unfortunately, the
> databases themselves were not always in sync, and since a lot of the
> "code" of the wiki is done via templates, the result is that not all
> changes work as expected when you deploy them on the test server.
> 
> To top it all off, we have the extreme complication of the Sphinx
> indexing system that is used for searches. For those who don't know,
> Sphinx is an external 3rd party indexing server that you point at a
> database, and then through configuration files, explain to it what
> fields in the tables are to be indexed, how to index them, if they are
> an enum type or a string, or even a timestamp etc. Even though Sphinx
> can index massive amounts of data just fine, the problem is the
> interface to the wiki itself.
> 
> As you can imagine, in order to use Sphinx in MediaWiki itself,
> someone had to write an extension that can access and query the Sphinx
> API from within the wiki software. While I don't recall the specifics,
> I don't think that the wiki extension itself was written by the same
> people that wrote Sphinx (although they might have had some
> involvement, I am not entirely sure tbh). Regardless, we found early
> on that the extension itself was rather lackluster overall in terms of
> functionality, features, configuration and interface that is used in
> the actual wiki for searching.
> 
> The underlying problem with the search seemed to stem from some
> decisions that were made with regard to i18n/l10n in the early days
> of the wiki. I don't remember all of the specifics atm, but my
> understanding is that in a "normal" MW install (like wikipedia), they
> use something called inter-language links that point to an actual
> separate install of the wiki. Our install, however, lumped everything
> into one massive install, and used subpages and whatnot to separate
> all of the languages. There is a little more to all of the specifics,
> but you hopefully get the idea.
> 
> So, to get Sphinx running and support the 20 or 30 or so languages
> that we wanted to offer in a single installation, we first had to
> reorganize (or "normalize", in DB lingo) the structure of the wiki so
> that we could isolate the individual pages in the wiki for Sphinx to
> index. Since we didn't use inter-language links but instead used the
> page title (ie: EN/MyPage/foo or FR/MyPage/foo), and since the page
> titles were all in English regardless of language, we had no direct
> access to the metadata (in this case, the language). Hence the
> reorganization: ambiguous titles such as IK (a language, and a
> feature in Blender!) all had to be moved around so that we could tell
> Sphinx, via some SQL functions that parse page titles, which language
> and which version of Blender a page is for. That let us use this
> data, which was not available otherwise, in actual searches.
> (Version is what we ended up calling "series", and just indicates
> whether it is 2.4, 2.6 etc.)
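> 
> The real parsing was done in SQL functions that are not shown here.
> Purely as an illustration of the idea, a Python sketch could look
> like the following. The two-letter language prefix follows the
> EN/MyPage/foo example; treating a path segment such as 2.6 as the
> series is an assumption, as is the name parse_title.
> 

```python
import re

LANG_RE = re.compile(r"[A-Z]{2}")    # e.g. EN, FR (assumed two-letter codes)
SERIES_RE = re.compile(r"\d+\.\d+")  # e.g. 2.4, 2.6 (assumed to be a segment)


def parse_title(title):
    """Split a page title into (language, series, remaining path)."""
    language = None
    series = None
    rest = []
    for segment in title.split("/"):
        # Only the very first segment is considered as a language prefix.
        if language is None and not rest and LANG_RE.fullmatch(segment):
            language = segment
        elif series is None and SERIES_RE.fullmatch(segment):
            series = segment
        else:
            rest.append(segment)
    return language, series, "/".join(rest)


print(parse_title("FR/MyPage/foo"))
print(parse_title("FR/2.6/MyPage"))
print(parse_title("IK/Intro"))
```

> Note that the last call reports IK as the language even if the page
> was meant to be about the Blender feature, which is the kind of
> ambiguity that title-based metadata runs into.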
> 
> To complicate things even further, we had to deal with updates to the
> Sphinx config files for dozens and dozens of languages, since for
> indexing to function properly, you have to tell it what language the
> text it is scanning is in (remember, our particular wiki setup has NO
> idea due to the way things are set up, and the metadata is only
> available via the page title). This proved to be a problem initially
> (many lost hours and days due to simple typos!) since changing SQL in
> 100+ places (30 or 40 language stanzas, times 2 changes for
> full/incremental index update SQL) was very error prone. In the end,
> I ended up using M4 macros as a template system, so that I could add
> a new language in a way that wasn't so prone to typos, and then
> regenerate the massive config files for the Sphinx process.
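> 
> The actual templates were M4 macros, which are not reproduced here.
> The same idea, sketched in Python with string.Template: one stanza
> template, expanded once per language and per index kind. The source
> names, the SQL and the page_touched cutoff are placeholders, not the
> real config.
> 

```python
from string import Template

# Hypothetical stanza; the real sphinx.conf and its SQL are not shown here.
STANZA = Template(
    "source wiki_${lang}_${kind}\n"
    "{\n"
    "    type      = mysql\n"
    "    sql_query = SELECT page_id, page_title FROM page "
    "WHERE page_title LIKE '${prefix}/%'${extra}\n"
    "}\n"
)

# Extra WHERE clause per index kind (assumed; the cutoff is a placeholder).
KINDS = {
    "full": "",
    "incremental": " AND page_touched > '20120101000000'",
}


def render_config(languages):
    """Expand the stanza once per language and per index kind."""
    stanzas = []
    for lang in languages:
        for kind, extra in KINDS.items():
            stanzas.append(
                STANZA.substitute(
                    lang=lang.lower(),
                    kind=kind,
                    prefix=lang.upper(),
                    extra=extra,
                )
            )
    return "\n".join(stanzas)


print(render_config(["en", "fr"]))
```

> Adding a language then means adding one entry to the list rather
> than editing the SQL by hand in several places.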
> 
> As for the Sphinx extension in MediaWiki, the stock search system was
> not able to query or display the results in a fashion that we felt
> was ideal for the pages. For example, we now had implied metadata in
> the page titles, but the internal search engine had no idea about any
> of this, so it would tend to just search everything (not that it was
> slow, just inefficient). The internal search engine itself could have
> been extended to deal with the exact setup we use for page titles,
> but since the admin and mindrones were already down the path of
> Sphinx when I came along, we stuck with it.
> 
> As for the extension itself, I ultimately ended up hacking it (meant
> to be temp/short term) so that it could handle queries via http GET
> to display particular languages and/or versions (series), in addition
> to the typical namespaces that MW supports. Also, since the team made
> the decision to use a fancy tabbed layout and have the possibility to
> display all series at the same time on a search result page, I had to
> change the extension so that it could do this, since by default the
> extension did not allow this type of search "grouping" (Sphinx itself
> actually did, but only with a limit of 3 results per group).
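> 
> The extension itself is PHP and is not shown here. As a rough
> Python sketch of the two ideas (filters passed via GET, and results
> grouped per series for the tabs), assuming made-up parameter names
> search, lang and series and made-up result data:
> 

```python
from urllib.parse import parse_qs


def parse_search_query(query_string):
    """Read the search terms and filters from a GET query string."""
    params = parse_qs(query_string)
    return {
        "terms": params.get("search", [""])[0],
        "lang": params.get("lang", [None])[0],
        "series": params.get("series", []),  # may be given several times
    }


def group_by_series(results, wanted_series=(), per_group_limit=None):
    """Group result titles by series, optionally capping each group."""
    groups = {}
    for result in results:
        series = result["series"]
        if wanted_series and series not in wanted_series:
            continue
        bucket = groups.setdefault(series, [])
        if per_group_limit is None or len(bucket) < per_group_limit:
            bucket.append(result["title"])
    return groups


# Hypothetical request and results, only to show the shape of the data.
query = parse_search_query("search=armature&lang=EN&series=2.4&series=2.6")
results = [
    {"title": "EN/2.6/Armatures", "series": "2.6"},
    {"title": "EN/2.4/Armatures", "series": "2.4"},
    {"title": "EN/2.6/Bones", "series": "2.6"},
    {"title": "EN/2.5/Armatures", "series": "2.5"},
]

for series, titles in group_by_series(results, query["series"]).items():
    print(series, titles)
```

> Each key of the returned dict would correspond to one tab on the
> results page.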
> 
> The result of all this is that the search system became an overly
> complex (and resource-intensive) system that ended up "working" but
> really needed a full, proper rewrite, and that is difficult to
> maintain since it requires a large amount of access to be coordinated
> with the system administrator. Combine that with the VCS setup that
> is used, and even updating the wiki itself can be a lesson in
> frustration ;)
> 
> So, while I can't speak for the future of the wiki as far as search
> goes, I will say this: whoever wishes to get involved with the long
> term maintenance of the system needs to understand what they are
> getting into. Ultimately, I think the system should be refreshed to
> something more modern. Things like configuration management (Puppet,
> Chef, CFEngine) come to mind, or perhaps a mini cluster of VMs on a
> single box to deal with test -> production (since many mistakes and
> frustrations were caused by having both on the same box, due to
> external processes using the same port binds etc), but all of this
> requires time, commitment and proper planning.
> 
> Anyways, I hate to hit all of you with my walls-o-text :) but I
> figured that it was important for me to explain this at least before
> some poor soul goes walking into some fresh new hell ;) As usual, feel
> free to contact me if you are curious or need some clarification. I
> think the server as a whole is running just fine thanks to Marco (the
> sys admin), and I am sure that between Luca, Marco and myself, we
> could answer most questions about the setup for anyone interested in
> helping out.
> 
> 
> Dan
> 
> 
> 
> 
>> 
>> kesten
>>> 
>>> 
>>> Hi,
>>> 
>>> We need a new system admin to help organizing wiki. Two issues that popped
>>> up today at the dev meeting:
>>> 
>> 
>>> 
>>> 2) If you type a url, the system now demands a captcha for people who are
>>> logged in even. Can this be disabled?
>>> 
>>> 
>>> Thanks,
>>> 
>>> -Ton-
>>> 
>>> ------------------------------------------------------------------------
>>> Ton Roosendaal  Blender Foundation   ton at blender.org    www.blender.org
>>> Blender Institute   Entrepotdok 57A  1018AD Amsterdam   The Netherlands
>>> 
>>> 
>>> 
>>> ------------------------------
>>> 
>>> _______________________________________________
>>> Bf-docboard mailing list
>>> Bf-docboard at blender.org
>>> http://lists.blender.org/mailman/listinfo/bf-docboard
>>> 
>>> 
>>> End of Bf-docboard Digest, Vol 92, Issue 1
>>> ******************************************
>> 
>> 
>> 
>> 
>> --
>> 
>> Kesten Broughton
>> President and Technology Director,
>> Solar Mobile Trailers
>> kesten at solarmobiletrailers.com
>> www.sunfarmkitchens.ca
>> 512 701 4209
>> 
>> 
