_, Path(_, _), and match order

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

_, Path(_, _), and match order

chris_lewis
Happy Thanksgiving!

I'm hacking on a project with Unfiltered and I spent a couple hours trying to figure what at first seemed to be erratic behavior in the JsonBody extrator. What I've finally realized that I apparently don't understand all I need to about pattern matching. Consider the test plan in the json module specs; something like this:

case POST(UFPath("/", JsonBody(js, _))) => ...
case _ => ...

This test passes fine, and so I was boggled when my app failed miserably. What it came down to was what I thought to be an insignificant difference; the json module spec can be made to fail in the same way with the following modification:

case POST(UFPath("/", JsonBody(js, _))) => ...
case UFPath(_, _) => ...

Now the last case always gets called, even though the first one appears to be more specific. Can someone explain why this is? My understanding of matches is that the case statements are evaluated in order, and as soon as one matches, evaluation stops. As such, I am still puzzled at why UFPath(_, _) is ever considered. Any insight on this is greatly appreaciated.

Thanks for the continued good work!

-chris
Reply | Threaded
Open this post in threaded view
|

Re: _, Path(_, _), and match order

daggerrz
You're right in your understanding of how the pattern matching is supposed to work.

I'm a little short of time right now, but was intrigued and had to take a quick look.

The problem here seems to be that Bytes extractor has side-effects here. As usual, the pattern match is called twice - once during the isDefinedAt call and second when the actual extraction occurs. The first call to Bytes.unapply consumes the InputStream which causes the subsequent to Bytes.unapply to return an empty Byte array and the a None:

    bos.toByteArray match {
      case Array() =>
        None
      case ba =>
        Some(ba, req)
    }

I'm surprised that this works if the seconds pattern is _, but there might be some optimization going on.

I'm very short on time right now, but the best solution for this might be to make HttpRequest trait's inputStream method to return a pristine stream on each call (by buffering and copying it locally). This clearly illustrates the inefficiency of pattern matching, though, and the Lift and Scala lang community has been discussing ways to avoid the double unapply sequence (there's some relevant info at http://www.assembla.com/wiki/show/liftweb/Memoizing_Requests_in_a_DispatchPF).

Nathan will probably have some more insights.

BTW: I did a very quick test to confirm this by changing the request binding for filter to:

 lazy val bytes = {
    val in = req.getInputStream
    val bos = new java.io.ByteArrayOutputStream
    val ba = new Array[Byte](4096)
    /* @scala.annotation.tailrec */ def read {
      val len = in.read(ba)
      if (len > 0) bos.write(ba, 0, len)
      if (len >= 0) read
    }
    read
    in.close
    bos.toByteArray
  }
  def inputStream = new java.io.ByteArrayInputStream(bytes)
Reply | Threaded
Open this post in threaded view
|

Re: _, Path(_, _), and match order

soft_props
I think the pattern we want to opt for is no usages of unapply (extractors) should every cause side effects. I think the way we want to avoid that is to pull what causes the side effect out of the extractor see this commit [1] for example.

I think we should opt for changing JsonBody's unapply to an apply and pulling out the filter on accepts application/json and leave that up to the enduser to chose if they want to constrain the request on that condition or not. I think I've had at least one other complaint about JsonBody's constraint being to ridged before so I think it would be a good thing to separate that from the actual work of JsonBody (parsing). Pulling things like that apart usually leads to more flexibility anyway.

I just pushed an example to a json-no-side-effects branch on gh [2] I'll talk with Nathan about it this weekend.

to reproduce the same flavor of filtering as the JsonBody extractor did before you can just do something like

case POST(UFPath("/", Accepts.Json(r)) => ResponseString(JsonBody(r) match {
  case Some(net.liftweb.json.JsonAST.JArray(a :: b :: Nil)) => "array of 2"
  case _ => "expected json array of 2"
})

or now you can relax the accepts constraint by just doing

case POST(UFPath("/", r) => ResponseString(JsonBody(r) match {
  case Some(net.liftweb.json.JsonAST.JArray(a :: b :: Nil)) => "array of 2"
  case _ => "expected json array of 2"
})

see unfiltered.request.JsonBodySpec in the json-no-side-effects branch for more details.

I think we talked before about memorizing access to the body of a request and the result was that we didn't want to loose the efficiency of streaming the request for large request bodies in memory vs loading the whole body in memory.

[1]: https://github.com/n8han/Unfiltered/commit/1c494fac125a2f8da59057ba9f37e39c4aca4e06
[2]: https://github.com/n8han/Unfiltered/commit/6dfd71927f40c120aebcdcf9433af970278acbe3
Reply | Threaded
Open this post in threaded view
|

Re: _, Path(_, _), and match order

chris_lewis
In reply to this post by daggerrz
I had forgotten about the double hit on extractors, but had confirmed that (somehow) the input stream was being consumed; I just wasn't sure why _ didn't cause the same behavior (I'd still like to figure out why, exactly).

I think I'm with Doug on the solution to this, although I'll miss having the JSON extracted for me. On the subject of JsonBody, I wonder why it matches on Accept rather than Content-Type. The Accept header indicates what the client wants, not what it's providing - that's what Content-Type is for, no?

Thanks for the speedy and thorough responses!

-chris
Reply | Threaded
Open this post in threaded view
|

Re: _, Path(_, _), and match order

soft_props
I agree on the Content-Type note. Good catch.
Reply | Threaded
Open this post in threaded view
|

Re: _, Path(_, _), and match order

n8han
Administrator
In reply to this post by soft_props
On 11/26/2010 01:28 PM, soft_props [via Databinder] wrote:
> I think we should opt for changing JsonBody's unapply to an apply and
> pulling out the filter on accepts application/json and leave that up
> to the enduser to chose if they want to constrain the request on that
> condition or not. I think I've had at least one other complaint about
> JsonBody's constraint being to ridged before so I think it would be a
> good thing to separate that from the actual work of JsonBody
> (parsing). Pulling things like that apart usually leads to more
> flexibility anyway.

Yeah. With 0.2.1 we stopped trying to hack around multiple tests of the
the pattern matching expressions, and instead went for zero side effect
extractors. Just missed a spot here.

Nathan