[tex-k] kpathsea Question
dongen
2012-03-01 19:01:27 UTC
Dear all,


When reading the kpathsea manual, I found one passage that wasn't clear to
me. I'd appreciate it if you could have a look at it and let me
know how I should interpret it.

On page 27, the kpathsea manual states:

[A search path is a colon-separated list of path elements.]
...
To check a particular path element e, Kpathsea first sees if a prebuilt database
applies to e, i.e., if the database is in a directory that is a prefix of e. If
so, the path specification is matched against the contents of the database.

A few things aren't clear in the last sentence of this description. I'd be much
obliged if you could point me to a more accurate description or provide an
explanation of how it works. The following are some questions I have.

* The path element e may have several prefixes. In what order are they searched?
For example, if e is equal to /a/b/c, then /a and /a/b are both prefixes of e.
* Is e a prefix of itself?
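
To make the two questions concrete, here is a minimal Python sketch that lists the directory prefixes of a path element. It is not kpathsea's code: the function name prefix_dirs is hypothetical, and counting e as a prefix of itself is an assumption made only for illustration.

```python
import posixpath

def prefix_dirs(element):
    """Return every directory that is a prefix of `element`, shortest first.

    Assumption for illustration: the element counts as a prefix of itself.
    """
    # Drop a trailing "//" (recursive-search marker) or "/" before splitting.
    path = posixpath.normpath(element.rstrip("/") or "/")
    parts = [p for p in path.split("/") if p]
    prefixes = ["/"]
    for part in parts:
        prefixes.append(posixpath.join(prefixes[-1], part))
    return prefixes

print(prefix_dirs("/a/b/c"))
```

For /a/b/c this yields /, /a, /a/b and /a/b/c, so several directories could in principle hold a database that "applies" to e, which is exactly why the search order matters.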

The following is a possible solution to Question 1. Could it be that there is an
implicit assumption that at most one prefix of any path element
may contain an ls-R database? If so, should the directory containing the ls-R
database be in TEXMFDBS? It would make sense. For example, find
/usr/local/texlive -name ls-R gives:

/usr/local/texlive/texmf-local/ls-R
/usr/local/texlive/2011/texmf/ls-R
/usr/local/texlive/2011/texmf-dist/ls-R
/usr/local/texlive/2011/texmf-var/ls-R
/usr/local/texlive/2011/texmf-config/ls-R

and kpsewhich -var-value=TEXMFDBS gives (modulo line breaks):

{!!/usr/local/texlive/2011/texmf-config,
!!/usr/local/texlive/2011/texmf-var,
!!/usr/local/texlive/2011/texmf,
!!/usr/local/texlive/2011/../texmf-local,
!!/usr/local/texlive/2011/texmf-dist}
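
The correspondence between the two listings can be checked with a short Python sketch. It assumes the simple, non-nested brace form shown above, strips the leading !! from each entry, and normalizes ".." purely lexically (symlinks are ignored). The function name texmfdbs_dirs is hypothetical and this is not how kpathsea itself expands the variable.

```python
import posixpath

def texmfdbs_dirs(value):
    """Split a simple brace-list TEXMFDBS value into normalized directories."""
    inner = value.strip()
    if inner.startswith("{") and inner.endswith("}"):
        inner = inner[1:-1]
    dirs = []
    for item in inner.split(","):
        item = item.strip()
        if item.startswith("!!"):
            item = item[2:]          # drop the "!!" marker
        if item:
            dirs.append(posixpath.normpath(item))  # lexical ".." removal only
    return dirs

value = """{!!/usr/local/texlive/2011/texmf-config,
!!/usr/local/texlive/2011/texmf-var,
!!/usr/local/texlive/2011/texmf,
!!/usr/local/texlive/2011/../texmf-local,
!!/usr/local/texlive/2011/texmf-dist}"""

# The ls-R files reported by find, as listed above.
found = {
    "/usr/local/texlive/texmf-local/ls-R",
    "/usr/local/texlive/2011/texmf/ls-R",
    "/usr/local/texlive/2011/texmf-dist/ls-R",
    "/usr/local/texlive/2011/texmf-var/ls-R",
    "/usr/local/texlive/2011/texmf-config/ls-R",
}

expected = {posixpath.join(d, "ls-R") for d in texmfdbs_dirs(value)}
print(expected == found)
```

Under these assumptions, every directory in TEXMFDBS holds exactly one of the ls-R files found, and no ls-R file lies outside TEXMFDBS.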

Thanks in advance for your help.

Regards,


Marc van Dongen
Karl Berry
2012-03-01 23:56:19 UTC
Hi Marc,

    To check a particular path element e, Kpathsea first sees if a
    prebuilt database applies to e, i.e., if the database is in a
    directory that is a prefix of e. If so, the path specification
    is matched against the contents of the database.

    A few things aren't clear in the last sentence of this
    description. I'd be much obliged if you could point me to a more
    accurate description or provide an explanation how it works.

Looking at the source and/or experimentation are really the only ways to
get definitive answers. I don't doubt the documentation could be
improved, though, and will try, once I fully understand what you're
getting at.

    * The path element e may have several prefixes. In what order are
    they searched? For example, if e is equal to /a/b/c, then /a and
    /a/b are both prefixes of e.

I'm not sure if /a/ls-R or /a/b/ls-R would be found first, if both
exist. Is that what you're asking?

    * Is e a prefix of itself?

Do you mean, would /a/b/c/ls-R be used to find /a/b/c/somefile? I
believe the answer is yes, but I'm not completely sure without doing the
experiment.

    implicit assumption that at most one prefix of any path
    element may contain an ls-R database?

I guess so. That's all I ever remember seeing or using. Is it useful
to have multiple ls-R's with different prefixes? I don't see how ...

    If so, should the directory containing the ls-R database be in
    TEXMFDBS?

Yes. See also the comments at TEXMFDBS in texmf.cnf.

Just wondering ... what are you doing that needs explanations of such
obscure corners of the system :)?

Thanks,
Karl
dongen
2012-03-02 04:50:52 UTC
* Karl Berry <karl at freefriends.org> [2012-03-01 15:56:19 -0800]:

Hi Karl,

: Looking at the source and/or experimentation are really the only ways to
: get definitive answers. I don't doubt the documentation could be
: improved, though, and will try, once I fully understand what you're
: getting at.

Thanks for getting back so promptly. I agree about the experimentation
but I do believe the documentation should be the ultimate reference.

: * The path element e may have several prefixes. In what order are
: they searched? For example, if e is equal to /a/b/c, then /a and
: /a/b are both prefixes of e.
:
: I'm not sure if /a/ls-R or /a/b/ls-R would be found first, if both
: exist. Is that what you're asking?

Yes. There is some comment in the manual about non-deterministic behaviour
when directories are searched recursively (section 3.3.6), but not for
prefixes in paths if an ls-R database is used. I think it's important to
know about it. When I was asking myself the question, I thought that
using longest-prefix-first would make the most sense because then you
could ``override'' paths in ls-R files in parent directories.
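
The longest-prefix-first idea can be sketched in Python as follows. This is a hypothetical policy written for illustration, not a description of what kpathsea does; the function name applicable_databases and the db_dirs argument (the directories that contain an ls-R file) are assumptions.

```python
import posixpath

def applicable_databases(element, db_dirs):
    """Return the database directories that are prefixes of `element`,
    longest first (hypothetical policy, not kpathsea's documented behaviour)."""
    # Drop a trailing "//" (recursive-search marker) or "/".
    path = posixpath.normpath(element.rstrip("/") or "/")
    hits = []
    for d in db_dirs:
        d = posixpath.normpath(d)
        # Compare whole path components so that /a/b does not match /a/bc.
        if path == d or path.startswith(d.rstrip("/") + "/"):
            hits.append(d)
    return sorted(hits, key=len, reverse=True)

print(applicable_databases("/a/b/c//", ["/a", "/a/b", "/a/bc", "/x"]))
```

With this ordering, an ls-R file in /a/b would be consulted before one in /a, which is what would make ``overriding'' possible.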

: * Is e a prefix of itself?
:
: Do you mean, would /a/b/c/ls-R be used to find /a/b/c/somefile? I
: believe the answer is yes, but I'm not completely sure without doing the
: experiment.

I meant if the path element is /a/b/c, would /a/b/c/ls-R be used.

: implicit assumption that at most one prefix of any path
: element may contain an ls-R database?
:
: I guess so. That's all I ever remember seeing or using. Is it useful
: to have multiple ls-R's with different prefixes? I don't see how ...

Well, if the manual doesn't state explicitly that it isn't possible, then
how are we to know it isn't? It may have its uses. For example, you could
have some ``global'' ls-R database near the root and override it in some
offspring directory. (Also see the comment after the next one below.)

: If so, should the directory containing the ls-R database be in
: TEXMFDBS?
:
: Yes. See also the comments at TEXMFDBS in texmf.cnf.

I noticed this comment. I still think it's important to know how to
deal with the multiple prefix problem. For example, in section 3.1 it
says

``If the database does not exist, or does not apply to this path element,
or contains no matches, the filesystem is searched ...''

The way I read it, this didn't rule out that, given a path element
of the form /a/b/c//, kpathsea might first look for ls-R databases in
/a, /a/b, and /a/b/c (in some order) and then possibly use recursive
search if the database lookups weren't successful. I think it's important
to know what happens in a case like this. I experimented with it and it
seems there's only one ls-R database that's used, but I don't want to
rely on experimentation. I need an authoritative answer, which is why I
posed the question.

: Just wondering ... what are you doing that needs explanations of such
: obscure corners of the system :)?

I am writing some chapters for TeX Live users on how to install TeX Live,
install user-defined packages, and use TeXWorks/JabRef. I wanted to make
a proper job out of the package installation because it's a common question
that keeps coming up over and over again. You get answers of the form, do
such and such, but it's never explained why.

I intend to make the chapters available soon. I can send you the current
versions if you're interested. If you have some comments about the kpathsea/ls-R
related sections, then that'd be much appreciated. (There are three of these
short sections per chapter---there's one chapter per OS---and they're almost
identical modulo OS peculiarities.)

Regards,


Marc van Dongen
Karl Berry
2012-03-03 01:27:39 UTC
Hi Marc,

    but I do believe the documentation should be the ultimate reference.

In my view, the code is the ultimate reference here.

    have some ``global'' ls-R database near the root and override it in
    some offspring directory.

I grant you the hypothetical idea, but I still fail to see any practical
reason to ever do any such "overriding".

    I need an authoritative answer,

I understand. Unfortunately, I can't give you one without doing all the
research, which I can't take the time for any time soon, if ever. I
guess the only authoritative answer I can give you is that "it's defined
to be undefined". Sorry.

Beyond that, what I can tell you is that however the code behaves now
should just be considered the intent. I never documented these unusual
cases because they never seemed relevant in the real world (and still
don't, to be honest).

If you did the experiments, I'm happy to consider them the answer :).

    a proper job out of the package installation because it's a common question

Yes, it certainly is. And we've explained it as well as we can in our
documentation, I think. Alternative explanations are certainly
welcome. But if you start trying to describe multiple ls-R's along the
same tree, I rather think the result would be confusion, not clarity :).

    I intend to make the chapters available soon. I can send you the
    current versions if you're interested. If you have some comments
    about the kpathsea/ls-R

Sure, I'm happy to look, just can't promise any given level of feedback.
I'll certainly have nothing to say about Windows. You might be best off
posting the drafts here so others can see too.

Best,
Karl
