|
It looks like the core issue is how to store and use cookies from previous requests, right? With Req, as far as I understand, the user has to manually pass the cookie jar as a parameter, right? Tesla allows the same thing: https://hexdocs.pm/tesla/Tesla.html#request/2. BTW, this issue deserves an explicit explanation in the README. However, I have the feeling that you might not be 100% clear on how to achieve that: either the user does the work of managing and updating the cookie jar, or a reference to the cookie jar is set up once and for all (at Req / Tesla client creation) to make it more user-friendly, which means having a separate cookie jar (outside the process / client). Am I right? If so, I might have some suggestions. |
Yeah, the core issue is how to pass the cookie jar between requests.
I like what we do for Req, where the user needs to do a tiny bit more work but the functionality is simpler and more composable. It does currently rely on an implementation detail, but this was suggested by @wojtekmach so I'm pretty sure he'll try to keep our use case supported even if the implementation details change.
Having the cookie jar set globally is not ideal for a number of reasons:
That being said, I want to make the UX good for Tesla users as well. What feels natural for Req might feel unnatural for Tesla and vice versa, given the different philosophies they take, so I'm curious what your suggestion is. Is something like this what you hinted at above?
client = Tesla.client([{HttpCookie.TeslaMiddleware, jar: HttpCookie.Jar.new()}])
result = Tesla.get!(client, "https://example.com")
# where in `%Tesla.Env{}` should the jar be stored? `opts` feels doable, but hacky
updated_jar = extract_jar(result)
Conceptually I like the approach, but:
@tanguilp what's your suggestion for this? |
Still thinking, but one additional question: aren't there some scenarios where we want the same cookie jar for many processes? I'm thinking of web scraping, or simulating many open browser tabs, for example. EDIT: how easy would that be with the current architecture?
Makes me think of "buckets". Except for the host thing: I think there are some rules about when to send cookies, depending on the host. |
Yeah, definitely. In that case you can wrap the cookie jar in an Elixir process and use the same HttpCookie.Jar.Server process across the workload. Access to the cookie jar is then naturally serialized, because a single GenServer handles all of its reads and writes.
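To illustrate the idea of serializing jar access behind one process, here is a minimal sketch. It is not the actual HttpCookie.Jar.Server: the `SharedJar` module, its function names, and the plain-map "jar" are all hypothetical stand-ins, and none of the RFC rules (domain matching, paths, expiry) are applied.

```elixir
defmodule SharedJar do
  # Hypothetical stand-in for a process-backed cookie jar (NOT HttpCookie.Jar.Server).
  # The "jar" is a plain map of host => %{cookie_name => value}.
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, %{}, opts)
  end

  # Store a cookie for a host. Calls are handled one at a time by the GenServer.
  def put_cookie(server, host, name, value) do
    GenServer.call(server, {:put, host, name, value})
  end

  # Build a "Cookie" header value for a host from the current jar state.
  def cookie_header(server, host) do
    GenServer.call(server, {:header, host})
  end

  @impl true
  def init(jar), do: {:ok, jar}

  @impl true
  def handle_call({:put, host, name, value}, _from, jar) do
    cookies = jar |> Map.get(host, %{}) |> Map.put(name, value)
    {:reply, :ok, Map.put(jar, host, cookies)}
  end

  def handle_call({:header, host}, _from, jar) do
    header =
      jar
      |> Map.get(host, %{})
      |> Enum.sort()
      |> Enum.map_join("; ", fn {name, value} -> "#{name}=#{value}" end)

    {:reply, header, jar}
  end
end

# Several concurrent workers ("tabs") share one jar process.
{:ok, jar} = SharedJar.start_link()

1..4
|> Task.async_stream(fn n -> SharedJar.put_cookie(jar, "example.com", "tab#{n}", "open") end)
|> Stream.run()

IO.puts(SharedJar.cookie_header(jar, "example.com"))
```

Because every read and write goes through `GenServer.call/2`, concurrent workers never see a half-updated jar, at the cost of the jar process becoming a single point of contention.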
Sorry, I was a bit imprecise here; you're right about the host thing. This is well specified in the RFC, and HttpCookie follows the RFC closely, so the user doesn't have to worry about it. The specific thing I was thinking of with using different cookie jars for different use cases was mimicking how existing software (e.g. mobile apps) behaves. You'll sometimes see that a mobile app accesses a service from two different places (e.g. the main app code and some library), and the server-side code is written in a way that requires the two cookie jars to be separate so the cookies don't mingle. Thankfully, this isn't a common case; it's just an example where this kind of flexibility comes in handy. There might be more cases like these, though. |
|
2 other points:
Prior art
OTP's
Use with Flow
It'd be interesting to see if your current solution with Req would work along with Flow. It happened to me not long ago: I wanted to parallelize HTTP requests to an API like this:
get_id_stream_from_db()
|> Flow.from_enumerable()
|> Flow.map(fn id -> http_get_with_cookie_jar_handling(id) end)
|> Flow.run()
In this scenario:
Conclusion
I'm inclined to think there is wisdom in httpc's decision to have the cookie jar handled in another process. I also think the cookie jar store should, at a minimum, be a behaviour and be extensible. For instance, httpc's implementation handles persistence to disk, and this could indeed be needed in some scenarios (or persistence to a DB). How I see things: for better extensibility we'd need:
Regarding stores I'd see:
Whatever the implementation, you'd indicate which bucket/profile to use when creating the HTTP client (tesla:
My 2 cents, but I've gone on long enough for today 😄 |
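As a rough illustration of the "store as a behaviour" idea, here is a minimal sketch. Everything in it is hypothetical: the `CookieStore` behaviour, its `load/1` and `save/2` callbacks, the `CookieStore.Disk` module, and the file naming scheme are assumptions made for the example, not part of HttpCookie, Tesla, or httpc. The jar is treated as an opaque Elixir term.

```elixir
defmodule CookieStore do
  # Hypothetical behaviour: a store loads and saves an opaque jar per profile/bucket.
  @callback load(profile :: String.t()) :: {:ok, jar :: term()} | :error
  @callback save(profile :: String.t(), jar :: term()) :: :ok
end

defmodule CookieStore.Disk do
  # Hypothetical disk-backed store: one file per profile in the system temp dir.
  @behaviour CookieStore

  @impl true
  def load(profile) do
    case File.read(path(profile)) do
      # [:safe] refuses terms that would create new atoms or functions.
      {:ok, binary} -> {:ok, :erlang.binary_to_term(binary, [:safe])}
      {:error, _reason} -> :error
    end
  end

  @impl true
  def save(profile, jar) do
    File.write!(path(profile), :erlang.term_to_binary(jar))
  end

  # Assumed naming scheme, for illustration only.
  defp path(profile) do
    Path.join(System.tmp_dir!(), "cookie_jar_#{profile}.bin")
  end
end

# Usage: persist a jar for one profile and read it back.
profile = "scraper_#{System.unique_integer([:positive])}"
:ok = CookieStore.Disk.save(profile, %{"example.com" => %{"session" => "abc"}})
{:ok, restored} = CookieStore.Disk.load(profile)
IO.inspect(restored)
```

An in-memory or DB-backed store would implement the same two callbacks, so the HTTP client would only need to know which store module and profile to use.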
|
I think there are definitely some gaps in the documentation, which can be improved going forward.
Usage with Flow should be fine if access is serialized behind a GenServer (same
There's no process dictionary being used in HttpCookie at the moment, btw.
As you noted, there are a lot of different use cases and I'd like to keep the lib simple and flexible. We don't need easy support for every niche use case, but I do want to make it possible to support a wide range of use cases. Very unusual requirements might need some extra code on top of HttpCookie (e.g. buckets and similar).
For persistence to disk, there is already a way to do it right now, although there are no docs for it:
Thanks @tanguilp, appreciate your feedback. I think this support will be fine for now and we can iterate on it once it gets more real-world usage. |
|
I'll merge this now because I think it's simple enough while providing enough functionality and allowing extension later. |
|
Looks great. Keep up the great work! |
Adds a basic Tesla middleware so it's easy to use with the Tesla HTTP client.
AFAICT, Tesla doesn't have a (non-hacky) way of storing private data inside %Tesla.Env{}.
To make this possible, I introduced a thin GenServer called HttpCookie.Jar.Server to manage the cookie jar.
I haven't used Tesla before, so maybe there's a better way I just don't know about.
This PR is meant to resolve #7.