• Requirements for permalink /%postname%/

    You’re probably not reading the post you thought you clicked on. Scroll down for the explanation.

    When I wrote the post entitled Requirements for permalink /%postname%/ I didn’t realise that in this site my permalink structure was already set to /%postname%/. This means that I can demonstrate the problem.

    The above link was created by inserting a link to the post, not the page. But if you click on the link you should end up back here, on the page.

    Current post’s fields

    21124 publish requirements-for-permalink-postname page

    Duplicate posts

    1 to 2 of 2
    Title | ID | Post Status | Post Name | Post Type
    Requirements for permalink /%postname%/ | 21123 | publish | requirements-for-permalink-postname | post
    Requirements for permalink /%postname%/ | 21124 | publish | requirements-for-permalink-postname | page

    Can we create a link to the post?

    If you click on this link, which is the post ID, you would expect to be shown the correct post, wouldn’t you?

    No. That doesn’t work!

    How about adding the post type?

    /?p=21123&post_type=post

    Requirements for permalink /%postname%/

    How about adding a prefix, as if the permalink structure was /%year%/%postname%/ ?

    /2022/?p=21123&post_type=post

    Requirements for permalink /%postname%/

    Clutching at straws…

    Results for different values of Status and Visibility

    OK, so what about when we change the status of this page? Do we get to see the post?

    Status | Logged in? | Logged out?
    Published (publish) | No | 404
    Draft (draft) | No | 404
    Pending (pending) | No | 404
    Scheduled (future) | No | 404
    Deleted (trash) | Yes | Yes
    Visibility: Private (private) | No | 404
    Visibility: Password protected (publish / future) | No | 404
    Do we see the post? Actual results with current solution.

    The original post

    For the time being, I can’t find a way to make a link to the post actually resolve to the post. So here it is – embedded directly into this page using a shortcode that references the post by its ID.


    Requirements for permalink /%postname%/

    At the WordPress Portsmouth Online Meetup, on 16th March 2022, we had a long discussion about a bug that was raised 12 years ago. It’s never been closed and hasn’t been worked on for a while. The issue in question is #13459 Conflict between post and page slugs/permalinks when permalink setting is set to /%postname%/

    In a nutshell the problem can be summarised by one of these two quotes.

    I clicked on the link and got shown the wrong content.

    I clicked on the link and got a 404 Not found.

    • The issue is assigned to the Permalink component.
    • There have been 16 duplicates of this issue.
    • Each duplicate has been closed, even though the problem hasn’t been fixed. That’s Business As Usual.
    • Abha suggested we chivvy it along by raising it at the weekly bug scrub.
    • We did and got some response… more work needed.

    So here’s my effort at describing the problem, the requirements to be satisfied, and some thoughts on possible solutions that may work even when there are duplicate slugs between post types.

    Reproducing the problem

    The problem occurs when the site is configured with permalinks that just use the postname. Also known as the slug, this is the field in the post that contains what is supposed to be a unique identifier for the post. Unfortunately, it turns out that it’s not unique and this can lead to unexpected results when clicking on permalinks to the website’s content.

    The permalink of this post is currently requirements-for-permalink-postname, since it’s been automatically generated from the post’s title. If I were to have my permalink structure set to /%postname%/ and I were to create a page or other custom post type with the same title… and hence the same slug, then attempting to view the post can lead to me seeing the page, not the post.

    For the majority of users this is an unexpected result.

    If I were to make the page private, then users who aren’t logged in would get a 404 instead.

    These are also unexpected results. The user knew the post was there, but wasn’t shown it.

    Steps to reproduce the problem

    1. Set permalinks to /%postname%/
    2. Create a page called XXX
    3. Create a post called XXX
    4. View posts archive
    5. Choose XXX

    Expected result: The post called XXX

    Actual result: The content of the page called XXX

    You can actually reduce the number of steps to produce the problem.

    1. Set permalinks to /%postname%/
    2. Create a post with the same name as an existing page.
    3. Save the post
    4. View it.

    Expected result: The post

    Actual result: The page

    Fun with post status

    Once you’ve published the post you’ll find that you can’t even Preview it. This is because the URL for the Preview uses the permalink, e.g. https://herbmiller.me/requirements-for-permalink-postname/?preview=true.

    You’ll find that you can preview the post when it’s a Draft or Scheduled post. What’s more surprising is that when you’re logged in you can see the post if its status is Draft or Scheduled (future).

    This is because the URL uses the post ID rather than the permalink… but it only works when you’re logged in and the post isn’t (yet) published!

    Summary of posts with this post name

    This post: Requirements for permalink /%postname%/

    This post’s fields: 21123 publish requirements-for-permalink-postname post

    1 to 2 of 2
    Title | ID | Post Status | Post Name | Post Type
    Requirements for permalink /%postname%/ | 21123 | publish | requirements-for-permalink-postname | post
    Requirements for permalink /%postname%/ | 21124 | publish | requirements-for-permalink-postname | page

    Requirements

    The basic requirement is to be able to satisfy the user’s request to view the content they were offered.

    The additional requirement implied by the /%postname%/ permalink structure is

    • either for WordPress to prevent duplicate URLs, when using this permalink structure.
    • or for WordPress to resolve duplicates using a documented / logical algorithm.
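
    A minimal sketch of what a "documented / logical algorithm" for the second option could look like. The function name sketch_resolve() and the priority order are assumptions made purely for illustration; this is not how WordPress behaves today.

```php
<?php
// Hypothetical tie-break: pick one candidate using a documented post type priority.
// The priority order is an assumption for illustration, not WordPress behaviour.
function sketch_resolve( array $candidates, array $priority ): ?array {
    foreach ( $priority as $post_type ) {
        foreach ( $candidates as $candidate ) {
            // Only published content of the preferred type can win.
            if ( $candidate['post_type'] === $post_type && 'publish' === $candidate['post_status'] ) {
                return $candidate;
            }
        }
    }
    return null; // Nothing viewable: the caller would respond with a 404.
}

$candidates = [
    [ 'ID' => 21124, 'post_type' => 'page', 'post_status' => 'publish' ],
    [ 'ID' => 21123, 'post_type' => 'post', 'post_status' => 'publish' ],
];
$winner = sketch_resolve( $candidates, [ 'post', 'page' ] );
echo $winner['ID'], "\n";
```

    With posts listed first the post wins. With a private page and pages listed first, the loop falls through to the published post rather than returning a 404.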

    For completeness, the solution should work

    • for all permalink structures
    • for all hierarchical permalinks, for posts, pages, attachments, taxonomies and Custom Post Types (CPTs).

    There are additional requirements to be satisfied:

    • pages are allowed to have non-unique slugs across hierarchies
    • CPTs are allowed to have non-unique slugs across hierarchies
    • solution should work when non-unique items have been trashed
    • solution should support paginated content

    Test cases

    Prior to developing any automated test cases I believe it’s necessary that we document the requirements clearly enough for expected results to be articulated and agreed. One way of achieving this is to document the scenarios where the wrong result is produced for the given input, what the correct result should be and why. In other words, to clearly document what we expect.

    This is quite a challenge as there are so many combinations. The problem is not limited to posts and pages. It also extends to attachments and custom post types.
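
    One way to write such scenarios down before automating anything is as plain data. The sketch below records two scenarios taken from the reproduction steps above; the array keys and the helper sketch_failing() are hypothetical names.

```php
<?php
// Scenarios from the reproduction steps; "expected" is what the visitor asked for.
$scenarios = [
    [ 'name' => 'published page, same slug',           'actual' => 'page', 'expected' => 'post' ],
    [ 'name' => 'private page, visitor not logged in', 'actual' => '404',  'expected' => 'post' ],
];

// Hypothetical helper: returns the names of scenarios whose actual result is wrong.
function sketch_failing( array $scenarios ): array {
    $failing = [];
    foreach ( $scenarios as $scenario ) {
        if ( $scenario['actual'] !== $scenario['expected'] ) {
            $failing[] = $scenario['name'];
        }
    }
    return $failing;
}

print_r( sketch_failing( $scenarios ) );
```

    Both scenarios are reported as failing, which matches the behaviour described above. Further rows for attachments and custom post types would follow the same shape.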

    Other custom permalinks don’t work

    The problem is not limited to /%postname%/. The table below summarises the results obtained with several custom permalinks.

    Custom permalink | Works? | Comments
    /%postname%/ | No | See above.
    /%postname%/%post_id%/ | No | Different results. Given that the permalink contained the post ID, these results were even more unexpected. Sometimes we get a 404.
    /%post_id%/ | Yes | But very unsatisfactory URL
    /%post_id%/%post_name%/ | Yes | Fairly unsatisfactory URL
    /-/%post_name%/ | Sort of | It works for posts but not for the Attachment scenario.
    /%year%/ | No | The date archive for year is displayed
    Any other combination not including /%postname%/ or /%post_id%/ | No | You’ll get some archive display for every post you click on.
    Custom permalink structure and duplicate permalink scenarios

    Note: I’ve not yet created / seen a scenario which applies to taxonomies and/or their permalink prefixes.

    I did however try setting the Optional Category base to a single blank character. It got converted to %20. This led to a 403 error when attempting to view the posts in a selected category.

    Similarly, entering a question mark into the Optional Tag base field led to a 404.

    Possible solutions

    Following analysis of wp_unique_post_slug() many moons ago, three options were proposed:

    1. Always prevent posts, pages (and CPTs?) from having the same slug (require unique slugs across all post types). Since having the same slug is actually fine with most permalink structures, this sounds like an unnecessary restriction.
    2. Only do the above for the /%postname%/ permalink structure. However, if the structure changes to /%postname%/ later (after the page and the post are created), we’ll still end up with a conflict.
    3. Leave this to a plugin, since wp_unique_post_slug() is filterable.

    These options focused on preventing the problem in the first place.
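
    As a rough illustration of option 3, the core of such a plugin could be a helper like the one below. The name sketch_unique_slug() and the $taken parameter are hypothetical; the "-2", "-3" suffix style mirrors what WordPress already does within a single post type.

```php
<?php
// Hypothetical helper: returns $slug, or $slug-2, $slug-3, ... if it is already taken.
// $taken is assumed to hold the slugs in use across ALL post types.
function sketch_unique_slug( string $slug, array $taken ): string {
    if ( ! in_array( $slug, $taken, true ) ) {
        return $slug; // No conflict, keep the slug as it is.
    }
    $suffix = 2;
    while ( in_array( $slug . '-' . $suffix, $taken, true ) ) {
        $suffix++;
    }
    return $slug . '-' . $suffix;
}

echo sketch_unique_slug(
    'requirements-for-permalink-postname',
    [ 'requirements-for-permalink-postname' ]
), "\n";
```

    In a real plugin the helper would presumably be called from a callback on the wp_unique_post_slug filter, with $taken loaded from the database. That wiring is not shown here and is untested.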

    An alternative approach would be to deal with the URL request taking into account the permalink structure. A third would be to detect and alter the duplicate slug on permalink creation such that when the link was clicked WordPress would find the correct post.

    I already use a similar technique for taxonomies that are attached to several CPTs.
    For example, https://blocks.wp-a2z.org/letters/b/?post_type=oik-plugins displays only the posts of type oik-plugins that are classified in the letters taxonomy as b.
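
    A minimal sketch of how such a link could be built for the taxonomy case. The helper name sketch_add_post_type() is hypothetical; inside WordPress, add_query_arg() would be the more usual tool.

```php
<?php
// Hypothetical helper: appends a post_type query argument to a permalink.
function sketch_add_post_type( string $url, string $post_type ): string {
    // Use "?" for the first query argument, "&" if the URL already has some.
    $separator = ( false === strpos( $url, '?' ) ) ? '?' : '&';
    return $url . $separator . http_build_query( [ 'post_type' => $post_type ] );
}

echo sketch_add_post_type( 'https://blocks.wp-a2z.org/letters/b/', 'oik-plugins' ), "\n";
```

    Note that, as shown at the top of this page, adding post_type to a ?p= link did not help for the duplicate post and page, so this only illustrates the taxonomy case.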

    What happens when the URL request is being processed?

    When the request’s query is parsed, WordPress uses the rewrite rules to help it construct the query to run to find the requested content.

    For /%postname%/ tracing of the wp hook showed the WP object containing:

     [query_vars] => Array
            [page] => (string) ""
            [pagename] => (string) "trouble-with-urls-paged"
     [query_string] => (string) "pagename=trouble-with-urls-paged"
     [request] => (string) "trouble-with-urls-paged"
     [matched_rule] => (string) "(.?.+?)(?:/([0-9]+))?/?$"
     [matched_query] => (string) "pagename=trouble-with-urls-paged&page="
     [did_permalink] => (boolean) 1

    Note: The global $wp_rewrite object’s rules array doesn’t differentiate between posts and pages.
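
    To see how the matched rule turns the request into a pagename query, here is a small stand-alone sketch. It assumes the rule is anchored with ^ and wrapped in # delimiters before matching; sketch_match_rule() is a hypothetical helper, not WordPress code.

```php
<?php
// Hypothetical helper, not WordPress code: applies one rewrite rule to a request path.
// Assumption: the rule is anchored with ^ and wrapped in # delimiters before matching.
function sketch_match_rule( string $rule, string $request ): ?string {
    if ( ! preg_match( '#^' . $rule . '#', $request, $matches ) ) {
        return null; // The rule does not apply to this request.
    }
    $pagename = $matches[1];
    $page     = $matches[2] ?? ''; // Optional numeric page suffix; empty when absent.
    return 'pagename=' . $pagename . '&page=' . $page;
}

$rule = '(.?.+?)(?:/([0-9]+))?/?$';
echo sketch_match_rule( $rule, 'trouble-with-urls-paged' ), "\n";
```

    The output is the same string as the matched_query in the trace. Nothing in the rule asks whether the slug belongs to a post or a page; it simply produces a pagename.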

    This means that the code’s already decided to load the page. The query that was performed was issued by `get_page_by_path()`:

    SELECT ID, post_name, post_parent, post_type
                    FROM wp_posts
                    WHERE post_name IN ('trouble-with-urls-paged')
                    AND post_type IN ('page','attachment')

    This didn’t make any sense to me. Why didn’t the post_type clause include post?

    Looking at the code I found where get_page_by_path() is being called from parse_request().

    if ( $wp_rewrite->use_verbose_page_rules && preg_match( '/pagename=\$matches\[([0-9]+)\]/', $query, $varmatch ) ) {
            // This is a verbose page match, let's check to be sure about it.
            $page = get_page_by_path( $matches[ $varmatch[1] ] );
            if ( ! $page ) {
                    continue;
            }

    I believe that it’s this logic that’s finding the page and therefore ignoring the post.
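
    The effect can be modelled without WordPress at all. The sketch below uses an in-memory stand-in for the two rows listed earlier and a hypothetical sketch_find_by_slug() that filters by post type in the same way as the SELECT shown above.

```php
<?php
// In-memory stand-in for the two wp_posts rows listed in this post.
$rows = [
    [ 'ID' => 21123, 'post_name' => 'requirements-for-permalink-postname', 'post_type' => 'post' ],
    [ 'ID' => 21124, 'post_name' => 'requirements-for-permalink-postname', 'post_type' => 'page' ],
];

// Hypothetical model of the SELECT: only the listed post types are searched.
function sketch_find_by_slug( array $rows, string $slug, array $post_types ): array {
    $found = [];
    foreach ( $rows as $row ) {
        if ( $row['post_name'] === $slug && in_array( $row['post_type'], $post_types, true ) ) {
            $found[] = $row['ID'];
        }
    }
    return $found;
}

$slug = 'requirements-for-permalink-postname';
print_r( sketch_find_by_slug( $rows, $slug, [ 'page', 'attachment' ] ) );         // Page 21124 only.
print_r( sketch_find_by_slug( $rows, $slug, [ 'post', 'page', 'attachment' ] ) ); // Both rows.
```

    With the post types limited to page and attachment, the post with ID 21123 is never a candidate, which is consistent with the page always being shown.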

    So now I need to understand why use_verbose_page_rules is set to true… and what I might be able to do to convince WordPress to have another look for any other posts that could satisfy the user’s request.

    More investigation necessary…

    I’ve started writing a plugin that attempts to intercept the current logic. It’s looking promising in some respects, but fails in others. Basically it has a look at what WordPress has decided the query should be and overrides it, changing pagename to name in the $query_vars array.
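
    The plugin itself isn’t shown here, so the following is only a guess at the shape of that override. The function name sketch_prefer_post() and the $post_exists callable are hypothetical.

```php
<?php
// Hypothetical filter callback: if WordPress chose "pagename" but a post with that
// slug exists, ask for the post instead by moving the slug to "name".
// $post_exists is an assumed callable( string $slug ): bool supplied by the caller.
function sketch_prefer_post( array $query_vars, callable $post_exists ): array {
    $slug = $query_vars['pagename'] ?? '';
    // Leave hierarchical paths (parent/child) and empty values alone.
    if ( '' === $slug || false !== strpos( $slug, '/' ) ) {
        return $query_vars;
    }
    if ( $post_exists( $slug ) ) {
        unset( $query_vars['pagename'] );
        $query_vars['name'] = $slug;
    }
    return $query_vars;
}

// Example with a stand-in lookup that knows one post slug.
$lookup = fn( string $slug ): bool => 'trouble-with-urls-paged' === $slug;
$vars   = sketch_prefer_post( [ 'page' => '', 'pagename' => 'trouble-with-urls-paged' ], $lookup );
print_r( $vars );
```

    In WordPress this would presumably be attached to the request filter, with the lookup backed by a real query for a post with that slug; that part is an untested assumption. On its own it only changes which duplicate wins, so the page then becomes the one that can’t be reached.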

    I will continue with my hacky workaround, but it would be nice to see a proper solution to this issue in the not too distant future.

    In that respect I have started to develop PHPUnit test cases to test the different scenarios. See bobbingwide/dupes.



    Published:

    Last updated:

    March 21, 2022
