[Glass] Files and UTF8 but not using String #encodeAsUTF8

Mariano Martinez Peck via Glass glass at lists.gemtalksystems.com
Mon Mar 2 06:08:20 PST 2015


On Thu, Feb 26, 2015 at 5:38 PM, Johan Brichau <johan at yesplan.be> wrote:

> Mariano,
>
> Does this help you?
>
> |codec stream|
> codec := GRCodec forEncoding: 'utf8'.
> stream := codec encoderFor: (GsFile open: 'bla.xml' mode: 'r' onClient:
> false).
> stream binary.
> stream contents
>
> I guess you can convert this to writing easily.
>
>
Hi Johan,

Yes, it did help. However, I am still getting errors. Your code above
works for me only if I send #contents. If I send #next, for example,
it fails. In my case I cannot send #contents; instead I pass the stream
to SIXX, which reads from it and materializes the objects. SIXX sends
#next:, for example. The problem is that the UTF8 Magritte reader expects
the stream to answer a number (the byte value) to #next, whereas GsFile
answers an instance of Character. See the attached screenshot.

I know GsFile understands #nextByte, which does answer the byte value,
but as you can see in the stack I have no control over which messages
are sent, so #next: is sent.
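
To make the mismatch concrete, here is a minimal sketch that sends both
messages to the same GsFile. It only uses messages already mentioned in
this thread plus #close. It assumes that /Users/mariano/test.txt exists
with at least two bytes in it, and that GsFile answers nil when the open
fails. What matters is the class of each answer, not the values.

```smalltalk
"Sketch only: contrast #next and #nextByte on one GsFile.
 Assumption: the file exists and is readable by the gem."
| file char byte |
file := GsFile open: '/Users/mariano/test.txt' mode: 'r' onClient: false.
file isNil ifTrue: [ self error: 'could not open file' ].
char := file next.       "what a wrapping decoder receives from #next"
byte := file nextByte.   "the numeric form the decoder wants instead"
file close.
Array with: char class with: byte class
```

If the two classes differ, any decoder that does arithmetic on the
answer of #next will fail on a raw GsFile.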

Reproducing the error is very easy:

| stream codec  |
codec := GRCodec forEncoding: 'utf8'.
stream := codec encoderFor: (GsFile open: '/Users/mariano/test.txt' mode:
'w' onClient: false).
stream text.
stream nextPutAll: '<mariano>'.
stream flush.

codec := GRCodec forEncoding: 'utf8'.
stream := codec decoderFor: (GsFile open: '/Users/mariano/test.txt' mode:
'r' onClient: false).
stream next

There you will get the DNU.
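
To rule out the write half, the raw bytes can be read back with
#nextByte, bypassing the decoder. This is only a diagnostic sketch: it
assumes the file written above has reached the disk, and that #nextByte
answers nil once nothing is left to read.

```smalltalk
"Sketch only: dump the bytes the encoder wrote, without any decoding."
| file bytes byte |
file := GsFile open: '/Users/mariano/test.txt' mode: 'r' onClient: false.
file isNil ifTrue: [ self error: 'could not open file' ].
bytes := OrderedCollection new.
"Stop on end of file, or on a nil answer from #nextByte (assumed to
 signal that no byte could be read)."
[ file atEnd not and: [ (byte := file nextByte) notNil ] ]
    whileTrue: [ bytes add: byte ].
file close.
bytes
```

If the collection holds the expected UTF8 byte values, the encoder side
is fine and only the decoder path over GsFile is affected.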

Any ideas on how to work around this?

Thanks in advance,





> Johan
>
> On 26 Feb 2015, at 21:22, Johan Brichau <johan at yesplan.be> wrote:
>
> For UTF8, I think you are better off using Grease's GRUtf8CodecStream.
> Let me see if I can extract something from what we did because we do read
> files in several encodings in Gemstone.
> There’s just stream wrappers all over the place… I’m looking into it right
> now.
>
> Johan
>
> On 26 Feb 2015, at 20:56, Mariano Martinez Peck <marianopeck at gmail.com>
> wrote:
>
> Hi Johan,
>
> But do you have any example of using a UTF8TextConverter with files?
> I found no way to do it :(
>
> Thanks!
>
> On Thu, Feb 26, 2015 at 4:45 PM, Johan Brichau <johan at yesplan.be> wrote:
>
>> It’s in the PharoCompatibility project on github.
>>
>> Though, I must say it merits some love.
>> It works (using it for years already) but we have sometimes done some
>> hacks to make it work with the Stream classes of GS. Or at least that’s
>> what I remember right off the top of my head.
>>
>> But I have not used this with SIXX. My last experience with SIXX was when
>> we moved the Vooruit database from Pharo+GOODS to Gemstone and I used the
>> commitOnAlmostOutOfMemory trick.
>>
>> Johan
>>
>> On 26 Feb 2015, at 18:46, Dale Henrichs <dale.henrichs at gemtalksystems.com>
>> wrote:
>>
>>
>>  What package is that class located in?
>>
>> Dale
>>
>> On 2/26/15 9:42 AM, Mariano Martinez Peck wrote:
>>
>>
>>
>> On Thu, Feb 26, 2015 at 2:31 PM, Dale Henrichs via Glass <
>> glass at lists.gemtalksystems.com> wrote:
>>
>>>  Mariano,
>>>
>>>
>>> Hmmm, this is a bit of a sticky wicket ....
>>>
>>> I'm afraid that the best way to solve this one is to make a major
>>> change to SIXX and force all output and input to be utf8 ... but if you
>>> are using SIXX to move data between Pharo and GemStone, then you'll have
>>> to make sure that SIXX on the Pharo side will properly decode utf8 ...
>>>
>>> Alternatively, we could try porting the whole TextConverter scheme
>>> to GemStone ...
>>>
>>>
>>  Hi Dale,
>>
>>  This is already ported; Johan did it. I just don't know how to use the
>> MultiByteBinaryOrTextStream (on which I can set a #converter:) together
>> with a GsFile backend. But I guess Johan did something, because I cannot
>> imagine he did everything in memory, right?
>>
>>
>>
>>>  I suppose it's about time we did something in this area ...
>>>
>>> Dale
>>>
>>>
>>>
>>> On 2/26/15 7:03 AM, Mariano Martinez Peck via Glass wrote:
>>>
>>>  Hi guys,
>>>
>>>  I am trying to implement the solution provided by Dale for exporting
>>> and importing large objects with SIXX:
>>> https://github.com/glassdb/SIXX?files=1
>>>
>>>  In his example he does:
>>>
>>>   strm := WriteStream on: String new.
>>>  #( 1 2 3) sixxOn: strm persistentRoot: (UserGlobals at:
>>> #'MY_SIXX_ROOT_ARRAY')
>>>
>>>  That stream 'strm' is in memory. I need files. And I want those files
>>> to be encoded with UTF8. In addition, in my experience, I have been trying
>>> to use GsFile as much as possible since it was way faster than other
>>> classes when I tested it. So...so far I was using the following approach to
>>> write a UTF8 file:
>>>
>>>   file := GsFile openWrite: aFilename.
>>>  file nextPutAll: aString encodeAsUTF8.
>>>
>>>  However, I cannot use that approach in the SIXX scenario. Why? Because
>>> I cannot easily hook in the parts where sixx gets the string of an object
>>> and writes it to the stream. So I kind of need to create the File stream
>>> with UTF8 from the beginning.
>>>
>>>  I do have UTF8TextConverter, but GsFile does not understand
>>> #converter:. I tried:
>>>
>>>  | stream |
>>>  stream := MultiByteBinaryOrTextStream on: (GsFile openWrite:
>>> aFilename).
>>>  stream converter: UTF8TextConverter new.
>>>  stream text.
>>>   MCPlatformSupport commitOnAlmostOutOfMemoryDuring: [
>>>     UserGlobals at: #'MY_SIXX_ROOT_ARRAY' put: Array new.
>>>         #( 1 2 3) sixxOn: stream persistentRoot: (UserGlobals at:
>>> #'MY_SIXX_ROOT_ARRAY')
>>>   ].
>>>   stream close.
>>>
>>>  But it doesn't work. I did see GsFile >> contentsAsUtf8, so I could
>>> write the whole file first, then grab its contents as UTF8, and then do
>>> what I did above (a new file written with #nextPutAll: of the UTF8).
>>> But since I am writing all this code because the object graph I am
>>> trying to serialize is big, I am afraid I will run out of memory while
>>> holding all the contents as UTF8. So I would really like a streaming
>>> approach.
>>>
>>>  Any ideas how can I do that?
>>>
>>>  Thanks,
>>>
>>>
>>>
>>>  --
>>> Mariano
>>> http://marianopeck.wordpress.com
>>>
>>>
>>>  _______________________________________________
>>> Glass mailing list
>>> Glass at lists.gemtalksystems.com
>>> http://lists.gemtalksystems.com/mailman/listinfo/glass
>>>
>>>
>>>
>>>
>>>
>>
>>
>>  --
>> Mariano
>> http://marianopeck.wordpress.com
>>
>>
>>
>>
>
>
> --
> Mariano
> http://marianopeck.wordpress.com
>
>
>
>


-- 
Mariano
http://marianopeck.wordpress.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2015-03-02 at 10.56.45 AM.png
Type: image/png
Size: 161931 bytes
Desc: not available
URL: <http://lists.gemtalksystems.com/mailman/private/glass/attachments/20150302/c01b1681/attachment-0001.png>

