wrong multipart charset


getagrip
Looking at

BlockingHttp.execute(host: HttpHost, req: HttpRequestBase)

I see that the req object sends multipart entities as US-ASCII, so non-ASCII characters such as öäüß are
replaced with question marks. The chain is as follows:

req.entity.multipart.charset -> US-ASCII
req.entity.multipart.parts    -> org.apache.http.entity.mime.FormBodyPart
part.header                      -> Content-Disposition: form-data, charset=US-ASCII
part.body                         -> org.apache.http.entity.mime.content.StringBody

The literal that was passed in ends up with the wrong characters (??? instead of äää) =>
part.body.content ->

63, 63, 63, 46, 112, 100, 102
  ?    ?    ?    .     p     d     f

This should be äää.pdf instead of ???.pdf. The literal itself is valid UTF-8:

val request = :/(server, port) / suffix
val post = request << (map += ("literal.id" -> "äää.pdf")) <<* ("repo", url, () => stream)
http(post / "update/extract")
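
The byte dump above can be reproduced without Dispatch or HttpClient, which suggests the loss happens at the point where the string is encoded rather than on the wire. This is only a minimal sketch using the JDK; the names literal, asciiBytes and utf8Bytes are made up for illustration, and the literal is written with \u escapes so the result does not depend on the source file's encoding.

```scala
import java.nio.charset.Charset

// Same literal as in the report ("äää.pdf"), written with escapes.
val literal = "\u00e4\u00e4\u00e4.pdf"

// Encoding to US-ASCII replaces every unmappable character with '?' (63).
val asciiBytes = literal.getBytes(Charset.forName("US-ASCII")).toList

// Encoding to UTF-8 keeps the characters, two bytes per 'ä'.
val utf8Bytes = literal.getBytes(Charset.forName("UTF-8")).toList

println(asciiBytes.mkString(", ")) // 63, 63, 63, 46, 112, 100, 102
println(utf8Bytes.mkString(", "))  // -61, -92, -61, -92, -61, -92, 46, 112, 100, 102
```

Once the string has been encoded as US-ASCII the original characters are gone, so the receiving side cannot recover them by decoding differently.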

Re: wrong multipart charset

n8han
Administrator
Thanks, I opened an issue for it:
https://github.com/n8han/Databinder-Dispatch/issues/24

I may not be able to look at this immediately; if someone can investigate and fix it, that would be great.

Nathan

(In reply to getagrip's message of 04/18/2011, 09:49 AM.)



Re: wrong multipart charset

getagrip
I created a patch so that Mime.scala uses UTF-8 instead of US-ASCII by default.

I cannot find an upload button, so I'll just paste it here:

--- F:/download/Mime_orig.scala Sa Apr 16 15:24:40 2011
+++ F:/download/Mime.scala Mo Apr 18 20:37:33 2011
@@ -3,7 +3,7 @@
 import java.io.{FilterOutputStream, OutputStream}
 import org.apache.http.HttpEntity
 import org.apache.http.entity.HttpEntityWrapper
-import org.apache.http.entity.mime.{FormBodyPart, MultipartEntity}
+import org.apache.http.entity.mime.{FormBodyPart, MultipartEntity, HttpMultipartMode}
 import org.apache.http.entity.mime.content.{FileBody, StringBody, InputStreamBody, ContentBody}
 
 import java.io.{File, InputStream}
@@ -48,7 +48,7 @@
       r.body.map {
         case ent: Mime.Entity => ent
         case orig: FormEntity =>
-          (new MultipartEntity with Mime.Entity).add(orig.oauth_params)
+          (new MultipartEntity(HttpMultipartMode.STRICT, null, java.nio.charset.Charset.forName("utf-8")) with Mime.Entity).add(orig.oauth_params)
         case ent => error("trying to add multipart content to entity: " + ent)
 
       } getOrElse new MultipartEntity with Mime.Entity
@@ -69,7 +69,7 @@
   trait Entity extends HttpEntity with FormEntity {
     def addPart(name: String, body: ContentBody)
     def add(values: Traversable[(String, String)]) = {
-      for ((name,value) <- values) addPart(name, new StringBody(value))
+      for ((name,value) <- values) addPart(name, new StringBody(value, "text/plain", java.nio.charset.Charset.forName("utf-8")))
       this
     }
     def oauth_params = Nil
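
To check the effect of the StringBody change in isolation, something like the following could be used. It is a sketch, not part of the patch: it assumes an httpmime 4.x jar on the classpath and uses the same three-argument StringBody constructor as the patch (newer httpmime releases may mark that constructor as deprecated). The names utf8, body, out and written are made up for illustration.

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.Charset
import org.apache.http.entity.mime.content.StringBody

val utf8 = Charset.forName("UTF-8")

// Same literal as in the report ("äää.pdf"), written with escapes.
val body = new StringBody("\u00e4\u00e4\u00e4.pdf", "text/plain", utf8)

// Write the part body to a buffer and decode it again as UTF-8.
val out = new ByteArrayOutputStream()
body.writeTo(out)
val written = new String(out.toByteArray, utf8)

println(written) // the original literal, with no question marks
```

With the charset passed explicitly, the body should come out as ten bytes (two per ä) instead of the seven ASCII bytes shown in the first post.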