Jesper Krogh (j...@krogh.cc)

User Information

Display Name:Jesper Krogh
Partial Email Address:j...@krogh.cc
Posts:
73 total
7 in Catalyst Framework
37 in DBIx::Class
11 in dbix-class@lists.scsys.co.uk
1 in Perl 5 Porters
1 in PostgreSQL - Admin
4 in PostgreSQL - Novice
9 in PostgreSQL - Performance
3 in PostgreSQL - SQL

5 Most Recent

1) Jesper Krogh Re: [Xapian-discuss] Feature request: Ligthen pressure on backup
Xapian
James Aylett wrote:
> On Mon, Mar 24, 2008 at 07:07:38AM +0100, Jesper Krogh wrote:
>
>> Another solution could be to let Xapian query several databases and
>> "merge" the result. Then I could make a new database each day and merge
>> once a week (or another timepattern that would fit the purpose).
>
> Jesper - Xapian can query multiple databases. You have to manage
> yourself which database you write into, but a database per day or
> similar would allow this. (You could perhaps merge fully shortly
> before you would do a level 0 backup anyway.)
>
> If you're doing this kind of index-to-new-then-merge strategy (which
> some people use for the different challenge of live indexing with high
> search load), then the xapian-compact(1) command will probably be
> helpful to you.
>
> Note that if you ascribe external meaning to Xapian document ids (for
> instance referencing them in a relational database), you may need to
> change things a little (such as by bringing external ids into Xapian
> and storing them in the document data, ie reversing the dependency)
> because of the way multiple database support is implemented.

This is exactly what I do. Can you elaborate a bit more on this?

I currently store the database id in the document, but it doesn't
serve any purpose. I also add a unique term to each document's term set,
which makes replace-by-term work flawlessly. Can I preserve this during a
merge (that is, keep the "unique term" in the documents)?
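
A minimal sketch of why document ids are fragile here, assuming Xapian's documented behaviour of interleaving docids when several databases are opened as one. The helper names `combined_docid` and `split_docid` are hypothetical and are not part of the Xapian API. A unique term, by contrast, is stored with the document itself, so it should not depend on this numbering.

```python
from typing import Tuple


def combined_docid(sub_docid: int, db_index: int, n_dbs: int) -> int:
    """Docid seen through a combined database.

    sub_docid: docid inside one sub-database (starts at 1).
    db_index:  0-based position of that sub-database in the combined set.
    n_dbs:     number of sub-databases opened together.
    Assumes docids are interleaved across the sub-databases.
    """
    if sub_docid < 1 or n_dbs < 1 or not 0 <= db_index < n_dbs:
        raise ValueError("invalid docid or database index")
    return (sub_docid - 1) * n_dbs + db_index + 1


def split_docid(combined: int, n_dbs: int) -> Tuple[int, int]:
    """Inverse mapping: return (sub_docid, db_index) for a combined docid."""
    if combined < 1 or n_dbs < 1:
        raise ValueError("invalid docid or database count")
    return (combined - 1) // n_dbs + 1, (combined - 1) % n_dbs


if __name__ == "__main__":
    # The same document (docid 2 in the first database) gets a different
    # combined docid depending on how many databases are opened together.
    print(combined_docid(2, 0, 2))
    print(combined_docid(2, 0, 3))
```

Because the combined docid changes with the number of databases, an id stored in an external relational database would go stale after adding a daily database, while a lookup by unique term would not.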

> You may want to look at:
> <http://xapian.org/docs/admin_notes.html#backup-strategies>
>
> and
>
> <http://xapian.org/docs/admin_notes.html#merging-databases>
> for some other notes that may be of use here.

Thanks a lot for the hints.

--
Jesper Krogh

_______________________________________________
Xapian-discuss mailing list
Xapian-di...@lists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss
2) Jesper Krogh [Xapian-discuss] Feature request: Ligthen pressure on backup
Xapian
Hi.

This is a small feature request for Xapian. Currently I have a
Xapian database with >5m records; the files take up around 124GB in the
Xapian directory, with a few "quite large" files:

# du -sh *
0       flintlock
4.0K    iamflint
1000K   position.baseA
63G     position.DB
716K    postlist.baseA
624K    postlist.baseB
45G     postlist.DB
8.0K    record.baseA
385M    record.DB
240K    termlist.baseA
15G     termlist.DB
12K     value.baseB
696M    value.DB

(And it is my impression that I have a quite small record.DB file.)
The idea comes from PostgreSQL's filesystem layout: it splits large tables
into segment files of a fixed (probably historic) maximum size, 1GB by
default, and that helps the backup significantly.

The current Xapian layout poses some "challenges" for backup systems, since
the daily incremental run basically has to back up the complete set (124GB)
even if only a single new document has been merged in.

The suggestion would be to split the files into several smaller files. I
know that the algorithms for searching the B-trees would probably be a bit
more complex, but it could mean that changes only touch a subset of the
files, thus making the backup easier.
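
A toy sketch of the idea, not of how Xapian stores data: if one large file is treated as fixed-size segments, an incremental backup only needs the segments whose content changed. The function names and the tiny segment size are assumptions made for illustration. Real B-tree updates may touch blocks scattered across the file, so the saving in practice could be much smaller than here.

```python
import hashlib
from typing import List


def segment_digests(data: bytes, segment_size: int) -> List[str]:
    """Return one SHA-256 digest per fixed-size segment of data."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [
        hashlib.sha256(data[i:i + segment_size]).hexdigest()
        for i in range(0, len(data), segment_size)
    ]


def changed_segments(old: bytes, new: bytes, segment_size: int) -> List[int]:
    """Indexes of segments in new that differ from, or do not exist in, old."""
    old_digests = segment_digests(old, segment_size)
    new_digests = segment_digests(new, segment_size)
    return [
        index
        for index, digest in enumerate(new_digests)
        if index >= len(old_digests) or old_digests[index] != digest
    ]


if __name__ == "__main__":
    before = bytes(10)           # ten zero bytes stand in for a large file
    after = bytearray(before)
    after[7] = 1                 # a single small change
    # With 4-byte segments only the segment holding byte 7 needs backing up.
    print(changed_segments(before, bytes(after), 4))
```

With whole-file backups the same one-byte change would force a copy of the entire file, which is the situation described for the 63G position.DB.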

Another solution could be to let Xapian query several databases and
"merge" the result. Then I could make a new database each day and merge
once a week (or another timepattern that would fit the purpose).

Other suggestions are welcome.

Thanks.

Jesper
--
Jesper

3) Jesper Krogh Re: [Xapian-discuss] Xapian Debian/Ubuntu repositiories does not work (APT does not handle redirects!)
Xapian
Marcus Rueckert wrote:
> On 2008-02-15 16:22:44 +0000, James Aylett wrote:
>> On Thu, Feb 14, 2008 at 11:30:43AM +0000, Olly Betts wrote:
>>
>>> This is the second problem which this new redirect has caused, and
>>> expecting everyone to update their sources.list is rather an imposition,
>>> so I think we should probably remove it (for now at least) and return to
>>> www.xapian.org working as it used to.
>> Changed, very grudgingly. apt can't handle redirects? Jesus, we'll be
>> finding it can't handle dependencies properly next. Oh wait, that's
>> already true.
>
> we just had the same problem with apt4rpm. the official answer was that
> an admin could redirect users to malicious servers if apt would support
> redirects. what a false sense of security.

That's a downright bogus argument. The archive should be properly
digitally signed to provide the real security.

--
Jesper


4) Jesper Krogh Re: [Xapian-discuss] flush problem
Xapian
Michael A. Lewis wrote:
> I am having a problem with flushing a database. I am adding N records
> to the DB (which amounts to 1 - 2000). At then end of the run, I
> issue a flush() call. The problem is that the flush call never seems
> to do anything. Every 10000 additions to the database and the library
> performs a flush (which can take up to 3 hours on a 560,000 document
> database) as if my flush call was never performed.

Not that I have a solution, but I have a similar problem with my Xapian
database (doccount: 8 million). Flush time is fairly long (over 10 minutes
on a 16-disk SAS array for 1000 documents added). Monitoring vmstat and
top, I can see that it neither saturates a single CPU nor comes anywhere
near the block input/output that the disks can deliver (around 5MB/s of
blocks in and out); top shows only around 8-12% IO wait.

All of the above was measured while Xapian was "flushing".

Still running xapian 1.0.4 (with perl-bindings)

--
Jesper

5) Jesper Krogh [Catalyst] Catalyst::Utils::env_value ?
catalyst@lists.scsys.co.uk
Hi

Can anyone help me find out what my problem is here?

$ perl -MDevel::SimpleTrace script/efam_server.pl
Undefined subroutine &Catalyst::Utils::env_value called
         at Catalyst::Plugin::ConfigLoader::get_config_path(/usr/share/perl5/Catalyst/Plugin/ConfigLoader.pm:165)
         at Catalyst::Plugin::ConfigLoader::find_files(/usr/share/perl5/Catalyst/Plugin/ConfigLoader.pm:109)
         at Catalyst::Plugin::ConfigLoader::setup(/usr/share/perl5/Catalyst/Plugin/ConfigLoader.pm:52)
         at Catalyst::setup(/usr/share/perl5/Catalyst.pm:851)
         at <eval>(/net/atlas.nzcorp.net/z/fx1200/bio/home/jk/NzDB/devel/script/../lib/Efam.pm:49)
         at main::(script/efam_server.pl:59)
Compilation failed in require at script/efam_server.pl line 59.

Efam.pm:49 is
__PACKAGE__->setup();


Jesper
