Everyone knows that if you want to search for a package in Gentoo Portage, emerge -s
is a real PITA, because it is so slooooooooow.
The solution is named Eix, and you can install it with a simple
emerge eix
and, after the package has been emerged, issue a
update-eix
to sync the eix database with Portage, or even a
eix-sync
to automatically do an emerge --sync && update-eix
Now, you can do your search with a simple
eix $pattern
and it will be lightning-fast! Run eix --help for a load of neat options :)
I prefer to use the search function available on some Gentoo Portage websites...
The problem is that recently http://packages.gentoo.org removed the search function, so…
Moreover, if you want to search for a package installed locally on your system, emerge -s or emerge -pv are way slower than eix.
It’s a lot faster in relative terms, but in absolute time the difference isn’t much here.
gentoox css # time emerge -s apache >/dev/null
real 0m1.058s
user 0m0.940s
sys 0m0.104s
gentoox css # time eix apache >/dev/null
real 0m0.051s
user 0m0.040s
sys 0m0.000s
Your emerge seems really fast :P because this is what I get on my desktop system:
time emerge -s apache > /dev/null
real 0m19.920s
user 0m1.512s
sys 0m0.364s
time eix apache > /dev/null
real 0m0.255s
user 0m0.040s
sys 0m0.008s
Pay attention, this is a FIRST run, as it should be. In a benchmark like this it’s the first run that matters, cause we are benchmarking a program, not a platform.
Starting from the 2nd run my numbers are very close to yours, but if I change the query, here we are again, with eix blasting emerge.
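To see the first-run versus later-run difference on your own box, a small timing loop is enough. This is only a sketch: time_runs is a name made up for this example, the demo command is a harmless placeholder, and the first run is only truly "cold" if nothing has read the relevant files since boot (the script does not flush any caches).

```python
import subprocess
import sys
import time

def time_runs(command, runs=3):
    """Run the command several times and return the wall-clock time of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the output, like the "> /dev/null" in the timings above.
        subprocess.run(command, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        timings.append(time.perf_counter() - start)
    return timings

# Placeholder command so the sketch runs anywhere; on a Gentoo system you
# would pass something like ["emerge", "-s", "apache"] or ["eix", "apache"].
demo_command = [sys.executable, "-c", "pass"]
for run_number, seconds in enumerate(time_runs(demo_command), start=1):
    print(f"run {run_number}: {seconds:.3f}s")
```

If the first number is much larger than the rest, most of the time is going into disk reads rather than into the program itself.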
It’s probably because the system I’m running it on is set up on a very fast RAID 10 with 10k rpm SCSI disks. That might have something to do with it =)
Yeah, emerge with its directory-based search system is heavily I/O bound… so eix is generally better because it’s using hashes (I suppose)
Hi ..
I have to disappoint you, we don’t use hashes.
eix only reads the binary cache generated by update-eix. That cache is a list of tightly packed binary records. Every record holds the (reduced, we don’t store everything) data for one package.
When eix searches, it only reads and interprets as much of a record as it needs to determine if it matches. If a record doesn’t match, we skip the rest of the record and continue with the next.
You can see that this brings us some performance if you compare “eix -H eix” with “eix -s eix”. “-H” will search the “homepage”-field, which is one of the last fields inside a record. OTOH, “-s” just checks the name, which is the first field. [1]
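To make the skip-ahead idea concrete, here is a minimal Python sketch. The record layout (a 4-byte body length followed by length-prefixed name, description and homepage fields) and the names pack_record and search_field are made up for illustration; the real eix cache format is different and is defined in the source linked in [1].

```python
import struct

def pack_record(name, description, homepage):
    """Pack one package as [u32 body length] + three [u16 length + bytes] fields."""
    body = b""
    for field in (name, description, homepage):
        data = field.encode("utf-8")
        body += struct.pack("<H", len(data)) + data
    return struct.pack("<I", len(body)) + body

def search_field(cache, field_index, pattern):
    """Return names of packages whose field number field_index contains pattern.

    Only fields up to field_index are decoded; the rest of each record is
    skipped by jumping ahead using the stored body length.
    """
    matches = []
    offset = 0
    while offset < len(cache):
        (body_len,) = struct.unpack_from("<I", cache, offset)
        pos = offset + 4
        name = value = ""
        for i in range(field_index + 1):
            (field_len,) = struct.unpack_from("<H", cache, pos)
            value = cache[pos + 2:pos + 2 + field_len].decode("utf-8")
            if i == 0:
                name = value
            pos += 2 + field_len
        if pattern in value:
            matches.append(name)
        # Jump to the next record without touching the remaining fields.
        offset += 4 + body_len
    return matches

cache = b"".join([
    pack_record("eix", "Search tool for the tree", "http://example.org/eix"),
    pack_record("apache", "Web server", "http://example.org/apache"),
    pack_record("apache-tools", "Extra tools", "http://example.org/tools"),
])
print(search_field(cache, 0, "apache"))  # ['apache', 'apache-tools']
print(search_field(cache, 2, "eix"))     # ['eix']
```

Searching field 0 (the name) decodes one field per record, while searching field 2 (the homepage) has to walk past the two fields in front of it first, which mirrors the "-s" versus "-H" difference described above.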
Also, our cache is a relatively small file that has _probably_ no or very little fragmentation in comparison to the cache directories of portage. For example, update-eix uses those directories and you know that update-eix takes a whole lot more time than eix.
You should also note that portage does more work. It actually checks the validity of its cache. So if you change an ebuild, portage will catch that and show you the correct information. eix would show you the obsolete information until update-eix is run again.
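The extra work of a validity check can be pictured with a simple timestamp comparison. This is only an illustration under loud assumptions: cache_is_stale and the file names are hypothetical, and a plain mtime comparison is a stand-in here, not a description of how portage actually validates its cache.

```python
import os
import tempfile

def cache_is_stale(source_path, cache_path):
    """True if the source file was modified after the cache was written."""
    return os.path.getmtime(source_path) > os.path.getmtime(cache_path)

def demo():
    """Create a fake ebuild and cache file, then 'edit' the ebuild."""
    with tempfile.TemporaryDirectory() as tmp:
        ebuild = os.path.join(tmp, "example-1.0.ebuild")
        cache = os.path.join(tmp, "example.cache")
        for path in (ebuild, cache):
            with open(path, "w") as handle:
                handle.write("placeholder\n")
        # Cache written after the ebuild: still valid.
        os.utime(ebuild, (1000, 1000))
        os.utime(cache, (2000, 2000))
        before_edit = cache_is_stale(ebuild, cache)
        # Ebuild touched after the cache was written: cache is out of date.
        os.utime(ebuild, (3000, 3000))
        after_edit = cache_is_stale(ebuild, cache)
        return before_edit, after_edit

print(demo())  # (False, True)
```

A tool that runs a check like this for every package pays for it with extra file system calls on each search; a tool that skips it is faster but can show outdated data.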
HTH, Emil
[1] The order of the fields inside a record can be seen here: https://projects.gentooexperimental.org/eix/browser/trunk/src/database/package_reader.cc?rev=504#L20
Emil, thanks a lot for your clarification and explanations. Nice to have you on this blog :)