Floki using the built in parser does not handle the optional closing p tag #395

@derek-zhou

Description

According to the HTML5 spec, the closing </p> tag is optional when another <p> immediately follows, i.e.:

<p>p1
<p>p2

is equivalent to:

<p>p1</p>
<p>p2</p>

However, Floki with the built-in parser does not handle this correctly: it nests the second <p> inside the first.

To Reproduce

  • Using Floki v0.32.0
  • Using Elixir v1.12.3
  • Using Erlang OTP v24
  • With this code:
iex> Floki.parse_document("<p>p1<p>p2")
{:ok, [{"p", [], ["p1", {"p", [], ["p2"]}]}]}
iex> Floki.parse_document("<p>p1</p><p>p2</p>")
{:ok, [{"p", [], ["p1"]}, {"p", [], ["p2"]}]}

It looks like Floki only fills in the missing </p> at the end of the document, instead of closing the first <p> when the second one opens.

Expected behavior

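A <p> element should never contain another <p> element. Both inputs above should parse to the same tree, with the two paragraphs as siblings.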
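
To make the expected tree concrete, here is a minimal sketch of a post-processing pass that hoists a nested <p> out of its parent <p>. The module `PNesting` and its function `unnest/1` are hypothetical names made up for this illustration; they are not part of Floki, and this is not how a parser fix would necessarily look. It only handles a <p> that is a direct child of another <p>, and it works on plain tuples in the shape shown in the output above.

```elixir
defmodule PNesting do
  # Hypothetical helper for illustration only; not part of Floki.
  # Walks a tree of {tag, attrs, children} tuples and turns any <p>
  # found directly inside another <p> into a following sibling.
  def unnest(nodes) when is_list(nodes) do
    Enum.flat_map(nodes, &unnest_node/1)
  end

  defp unnest_node({"p", attrs, children}) do
    children = unnest(children)
    # Everything before the first nested <p> stays inside this <p>;
    # the nested <p> and whatever follows it become siblings.
    {inside, hoisted} = Enum.split_while(children, fn child -> not p?(child) end)
    [{"p", attrs, inside} | hoisted]
  end

  defp unnest_node({tag, attrs, children}) when is_binary(tag) and is_list(children) do
    [{tag, attrs, unnest(children)}]
  end

  # Text nodes, comments and any other node shapes pass through unchanged.
  defp unnest_node(other), do: [other]

  defp p?({"p", _attrs, _children}), do: true
  defp p?(_other), do: false
end

# The tree reported above for "<p>p1<p>p2":
tree = [{"p", [], ["p1", {"p", [], ["p2"]}]}]

IO.inspect(PNesting.unnest(tree))
# prints [{"p", [], ["p1"]}, {"p", [], ["p2"]}]
```

The result matches what Floki already returns for the explicitly closed input, which is the tree I would expect for both.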
