A jQuery-like HTML manipulation and traversal library for Go, built on net/html and cascadia CSS selectors.
PuerkitoBio/goquery is a Go library that brings a jQuery-like syntax and feature set to the Go programming language, enabling developers to parse, traverse, and manipulate HTML documents. It is built on Go's net/html package and the cascadia CSS selector library, providing a familiar chainable interface for web scraping and HTML processing tasks. The library intentionally mirrors jQuery's API to leverage its widespread familiarity, prioritizing developer comfort and consistency.
goquery is aimed at Go developers who need to scrape web pages, parse HTML, or manipulate a parsed document, particularly those with prior experience using jQuery in JavaScript. It also suits developers building data extraction tools, web crawlers, or other applications that process HTML content in Go.
Developers choose goquery because it offers a jQuery-like API that is intuitive and familiar, reducing the learning curve for those already comfortable with jQuery. Its integration with Go's net/html and cascadia provides robust HTML parsing and CSS selector support, making it a reliable and efficient choice for HTML manipulation in Go compared to lower-level alternatives.
A little like that j-thing, only in Go.
Mirrors jQuery's API with chainable methods, making it intuitive for developers with jQuery experience, as highlighted in the philosophy section.
Uses the cascadia library for robust selector support, enabling precise element selection similar to jQuery's capabilities.
Built on Go's net/html package, it allows loading documents from various sources like HTTP responses and readers, with performance optimizations noted in the changelog.
The API has been declared stable since v1.0.0 and the library receives regular updates, which supports backward compatibility for production use.
Does not execute JavaScript, so content generated client-side never appears in the parsed document. This limits its usefulness on modern dynamic websites, because it sees only the static HTML handed to net/html.
Requires UTF-8 encoded HTML input, so developers must convert other encodings themselves; the README points to the project wiki for ways to do this.
Omits jQuery's stateful manipulation functions such as height() and detach(), because net/html returns a tree of nodes rather than a full-featured DOM. This reduces parity with jQuery's API.