`_Url` to inherit from `str`

This was discussed previously in one of the PRs.

I'm re-opening this for tracking, since the type check in this part of `w3lib.util.to_unicode` fails: https://github.com/scrapy/w3lib/blob/master/w3lib/util.py#L46-L49

In particular, the error occurs when doing something like:

```python
from scrapy.linkextractors import LinkExtractor

link_extractor = LinkExtractor()
link_extractor.extract_links(response) 
```

where `response` is a `web_poet.page_inputs.http.HttpResponse` instance and not `scrapy.http.Response`.

The full stack trace is:

```python
  File "/usr/local/lib/python3.10/site-packages/scrapy/linkextractors/lxmlhtml.py", line 239, in extract_links
    base_url = get_base_url(response)
  File "/usr/local/lib/python3.10/site-packages/scrapy/utils/response.py", line 27, in get_base_url
    _baseurl_cache[response] = html.get_base_url(
  File "/usr/local/lib/python3.10/site-packages/w3lib/html.py", line 323, in get_base_url
    return safe_url_string(baseurl)
  File "/usr/local/lib/python3.10/site-packages/w3lib/url.py", line 141, in safe_url_string
    decoded = to_unicode(url, encoding=encoding, errors="percentencode")
  File "/usr/local/lib/python3.10/site-packages/w3lib/util.py", line 47, in to_unicode
    raise TypeError(
TypeError: to_unicode must receive bytes or str, got ResponseUrl
```
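The failure can be reproduced in isolation without Scrapy or web-poet. The sketch below is a minimal illustration, not the actual library code: `to_unicode_check` is a simplified stand-in for the type check in `w3lib.util.to_unicode`, and this `ResponseUrl` is a hypothetical wrapper that only assumes the real class holds a string without subclassing `str`.

```python
def to_unicode_check(text, encoding="utf-8", errors="strict"):
    """Simplified stand-in (assumption) for the check in w3lib.util.to_unicode."""
    if isinstance(text, str):
        return text
    if not isinstance(text, bytes):
        raise TypeError(
            f"to_unicode must receive bytes or str, got {type(text).__name__}"
        )
    return text.decode(encoding, errors)


class ResponseUrl:
    """Hypothetical URL wrapper: stores a string but does not inherit from str."""

    def __init__(self, url):
        self._url = str(url)

    def __str__(self):
        return self._url


url = ResponseUrl("https://example.com/")

# Passing the wrapper directly fails the isinstance check.
try:
    to_unicode_check(url)
except TypeError as exc:
    error_message = str(exc)

# Casting to str first passes the check.
converted = to_unicode_check(str(url))
```

The wrapper is rejected purely because of its type, even though `str(url)` yields a valid URL string.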

An alternative would be to adjust the Scrapy code instead, casting with `str(response.url)` at every use.
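The two options can be compared side by side. This is a rough sketch under stated assumptions: `StrUrl` and `WrappedUrl` are hypothetical names used only for illustration, and `is_str_or_bytes` mimics the kind of `isinstance` check shown in the stack trace rather than calling w3lib itself.

```python
def is_str_or_bytes(value):
    """Mimics an isinstance-based type check like the one in the stack trace."""
    return isinstance(value, (bytes, str))


class StrUrl(str):
    """Hypothetical URL type that inherits from str (the proposal in this issue)."""


class WrappedUrl:
    """Hypothetical URL type that wraps a string without inheriting from str."""

    def __init__(self, url):
        self._url = str(url)

    def __str__(self):
        return self._url


subclass_url = StrUrl("https://example.com/")
wrapped_url = WrappedUrl("https://example.com/")

# Option 1: a str subclass passes the check with no changes in calling code.
subclass_ok = is_str_or_bytes(subclass_url)

# Option 2: a wrapper only passes if every caller casts it explicitly.
wrapped_ok = is_str_or_bytes(wrapped_url)
wrapped_cast_ok = is_str_or_bytes(str(wrapped_url))
```

The subclass keeps existing callers working unchanged, while the cast has to be repeated at each call site that hands the URL to code expecting `str`.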

`_Url` to inherit from `str` #187
