To use GoCenter:
export GOPROXY=https://gocenter.io

github.com/emiruz/textextract

textextract is a tiny library (87 lines of Go) that identifies where the article content is in an HTML page (as opposed to navigation, headers, footers, ads, etc.), extracts it, and returns it as a string. It is similar to Boilerpipe, but written in Go for use from Go.
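To make the idea of content extraction concrete, here is a minimal sketch of one naive heuristic: split the page at block-level tags and keep the block with the most words. This is not textextract's algorithm or API, neither of which is described on this page. The function name `mainText` and the regular expressions are assumptions made for illustration, and a production implementation would more likely use a real HTML parser than regular expressions.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

var (
	// Drop script and style elements together with their contents.
	reNoise = regexp.MustCompile(`(?is)<(script|style)[^>]*>.*?</(script|style)>`)
	// Treat common block-level tags as boundaries between text blocks.
	reBlock = regexp.MustCompile(`(?i)</?(p|div|li|ul|ol|nav|header|footer|section|article|h[1-6]|td|tr|table|br)\b[^>]*>`)
	// Any tag that remains is inline markup and is removed.
	reTag = regexp.MustCompile(`<[^>]+>`)
)

// mainText is a hypothetical helper (not part of textextract). It returns
// the text of the block that contains the most words, on the assumption
// that article text is longer than navigation or footer text.
func mainText(page string) string {
	page = reNoise.ReplaceAllString(page, " ")
	// Collapse all whitespace so "\n" can serve as a block separator.
	page = strings.Join(strings.Fields(page), " ")
	page = reBlock.ReplaceAllString(page, "\n")

	best, bestWords := "", 0
	for _, block := range strings.Split(page, "\n") {
		text := html.UnescapeString(reTag.ReplaceAllString(block, ""))
		words := strings.Fields(text)
		if len(words) > bestWords {
			best, bestWords = strings.Join(words, " "), len(words)
		}
	}
	return best
}

func main() {
	page := `<html><body><nav><a href="/">Home</a> <a href="/about">About</a></nav>
<div><p>The quick brown fox jumps over the lazy dog near the river bank.</p></div>
<footer>Copyright 2020</footer></body></html>`
	fmt.Println(mainText(page))
}
```

The sketch returns only a single block, so it would truncate a multi-paragraph article. Approaches in the Boilerpipe family instead classify every block using features such as text and link density and keep all blocks judged to be content.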
Last Modified: June 14th 2020
Stars: 10
License: MIT
Downloads: 3