Background
In #154, we addressed the issue of missing pages in the search index. The root cause was that some index entries lacked a title property (sdesc). We resolved this by applying the same solution used for the manual content: using the description (ldesc) as the title.
To avoid duplicating text in both the title and description fields, we now pull the description from the parent <book>. For example:

In this example, the title "Type" was taken from the page description, while the new description ("Language Reference") comes from the parent <book>. You can see the implementation here:

    if ($index["sdesc"] === "" && $index["ldesc"] !== "") {
        $index["sdesc"] = $index["ldesc"];

        $parentId = $index['parent_id'];
        // isset() to guard against undefined array keys, either for root
        // elements (no parent) or in case the index structure is broken.
        while (isset($this->indexes[$parentId])) {
            $parent = $this->indexes[$parentId];
            if ($parent['element'] === 'book') {
                $index["ldesc"] = Format::getLongDescription($parent['docbook_id']);
                break;
            }
            $parentId = $parent['parent_id'];
        }
    }

Issue
Some entries, such as extension main pages (e.g. book.strings, book.zip) and top-level pages (e.g. copyright, getting-started, security), don't have a parent <book>. For these, the loop above never finds a book, so ldesc keeps its original value and the same text ends up in both the title and the description:



Proposed fix
While some entries lack a parent <book>, every entry has at least one parent <set>. The root entry itself is a set called "PHP Manual".
The proposed solution is to fall back to the first <set> in the hierarchy when no <book> is found:
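
As a rough illustration of that fallback, here is a standalone sketch, not the actual PhD change. The function name `findDescriptionSourceId`, the sample ids, and the cycle guard are all assumptions made for this example. In PhD itself, the returned id would presumably be passed to `Format::getLongDescription()` as in the existing code.

```php
<?php
// Hypothetical helper (not part of PhD): walk up the parent chain and
// return the docbook_id of the nearest <book>. If no <book> is found,
// return the docbook_id of the first <set> seen on the way up, or null
// when neither exists.
function findDescriptionSourceId(array $indexes, ?string $parentId): ?string
{
    $firstSetId = null;
    $seen = []; // assumption: guard against cycles in a broken index

    while ($parentId !== null && isset($indexes[$parentId]) && !isset($seen[$parentId])) {
        $seen[$parentId] = true;
        $parent = $indexes[$parentId];

        if ($parent['element'] === 'book') {
            return $parent['docbook_id'];
        }
        if ($parent['element'] === 'set' && $firstSetId === null) {
            $firstSetId = $parent['docbook_id'];
        }
        $parentId = $parent['parent_id'];
    }

    return $firstSetId;
}

// Made-up sample data mirroring the shape used in the snippet above.
$indexes = [
    'root.set'  => ['element' => 'set',     'docbook_id' => 'root.set',  'parent_id' => null],
    'book.zip'  => ['element' => 'book',    'docbook_id' => 'book.zip',  'parent_id' => 'root.set'],
    'zip.setup' => ['element' => 'chapter', 'docbook_id' => 'zip.setup', 'parent_id' => 'book.zip'],
];

// An entry inside the book resolves to the book, as before.
$insideBook = findDescriptionSourceId($indexes, 'book.zip');

// The book's own main page has no parent <book>, so it falls back to the set.
$bookMainPage = findDescriptionSourceId($indexes, 'root.set');
```

The sketch keeps looking for a <book> after it has seen a <set>, so a book always takes priority and the set is used only when the walk reaches the root without finding one.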



I have a working implementation and will submit a PR soon.
Code reference for the snippet above: phd/phpdotnet/phd/Package/PHP/Web.php, lines 244 to 258 in 673b2da.