Rixstep NSA

Do Be Do Be Do Too

rixstep.substack.com

A new iteration.

Rixstep
Mar 8

This is Be again. The takeaway this time is that you're finally no longer bound by file extensions to determine the encoding (and thereby how files are loaded).

To review encodings.

Check the File-Encoding menu on either Be or Rixedit. Those menus are built dynamically at runtime. (TextEdit has an anaemic version of this, offering only a few of the encodings available.)

Those are the character encodings available on your machine.

The encoding is used both to store files in digital form and to read them back. Naturally it's crucial to use the same encoding on both input and output.

UTF-8

One of the most common quagmires comes with UTF-8. UTF-8, as you may remember, is a work of at least near genius by Unix co-creator Ken Thompson, conceived over dinner with Rob Pike at a New Jersey diner in September 1992. They'd been asked, reportedly by people from IBM and X/Open, to vet another design on a pressing timeline, and Ken got the idea and sketched it out on a paper placemat.

UTF-8 makes it possible to store anything (Unicode is basically anything) as plain 8-bit bytes, so to speak. ASCII characters stay single bytes; everything else becomes a sequence of two to four bytes, with the lead byte signalling how many continuation bytes are coming.
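A minimal Python sketch of that variable-width scheme, purely for illustration (it's not from Be's code, and the sample characters and the `utf8_lengths` helper are arbitrary choices):

```python
# Sample characters chosen for illustration:
# plain ASCII, e-acute, the euro sign, and an emoji outside the BMP.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

def utf8_lengths(chars):
    """Return {character: number of bytes in its UTF-8 encoding}."""
    return {ch: len(ch.encode("utf-8")) for ch in chars}

lengths = utf8_lengths(samples)

# Show each code point next to the bytes UTF-8 uses to store it.
for ch, n in lengths.items():
    print(f"U+{ord(ch):04X} -> {n} byte(s): {ch.encode('utf-8').hex(' ')}")
```

ASCII text is byte-for-byte identical in UTF-8, which is why files look fine right up until the first accented character turns up.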

The default character encodings on different hardware platforms also differ.

For whatever reason, the default Apple encoding (Western Mac OS Roman) and Unicode (UTF-8) mixed together can stick it to you. This is usually sorted by exiting the application, then making sure the extension doesn't mandate a specific encoding, then trying again. This isn't necessarily the end of it, as you can find that UTF-8 sequences don't appear as they should. This is hopefully remedied by Be's latest update.
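To see what that mix-up looks like, here's a small Python sketch (an illustration only, not how Be or TextEdit read files) that writes text as UTF-8 and then reads the same bytes back as Mac OS Roman:

```python
text = "caf\u00e9"                      # 'café', with a precomposed e-acute

utf8_bytes = text.encode("utf-8")       # what a UTF-8 aware editor writes to disk
wrong = utf8_bytes.decode("mac_roman")  # what a Mac OS Roman reader makes of it
right = utf8_bytes.decode("utf-8")      # same bytes, matching encoding

print(wrong)   # the e-acute comes out as two unrelated characters
print(right)   # the original text
```

The bytes on disk never changed. Only the encoding used to interpret them did.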

Starting now, Be will read the character encoding before attempting to read the file itself, even though the system can ask for those two steps in the opposite order.

But we seem to have grokked it.

The way this is done is by putting the encoding on the 'backside' - in the proprietary XA (extended attribute).

Apple's TextEdit has dabbled in this, but not extensively and not, from what we saw, through standard binary XML XAs. And Rixstep's Be goes further and does it cleaner anyway.

Here are the config settings Be stores.

<key>Be</key>
<dict>
    <key>Align</key>
    <key>Back</key>
    <key>Code</key>
    <key>Font</key>
    <key>Fore</key>
    <key>Frame</key>
    <key>Range</key>
    <key>Spell</key>
</dict>

The important thing here is 'Code'. That's the encoding.
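As a rough sketch of how a config like this can travel as a binary property list, here's a round trip with Python's `plistlib`. Only the key names come from the listing above. The values are hypothetical, and treating 'Code' as an `NSStringEncoding` number (4 being `NSUTF8StringEncoding`) is an assumption, not something Be's format is known to do:

```python
import plistlib

# Hypothetical values; only the key names are taken from the listing above.
# Assumption: 'Code' holds an NSStringEncoding constant (4 = UTF-8).
config = {"Be": {"Align": 0, "Code": 4, "Spell": True}}

# Serialise to binary plist bytes: the kind of blob that fits in an XA.
blob = plistlib.dumps(config, fmt=plistlib.FMT_BINARY)

# Reading it back needs no knowledge of the text file's own encoding.
restored = plistlib.loads(blob)
code = restored["Be"]["Code"]

print(blob[:8], code)
```

The point is that the config is self-describing: it can be parsed before a single byte of the document has been interpreted.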

Starting with this release, Be will, even if otherwise prompted, get the stored encoding before loading the file - as, to load a file, you absolutely must know the encoding first.
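The ordering can be sketched in Python like this. It's a sketch under assumptions rather than Be's actual code: the `stored_codec` and `load_text` helpers, the `CODECS` table, the use of `NSStringEncoding` numbers for 'Code' and the Mac OS Roman fallback are all hypothetical, and the XA is stood in for by a plain bytes value:

```python
import plistlib

# Assumed mapping from NSStringEncoding constants to Python codec names.
CODECS = {1: "ascii", 4: "utf-8", 5: "latin-1", 30: "mac_roman"}

def stored_codec(xa_blob, default="mac_roman"):
    """Phase one: pick the codec from the stored config, if there is one."""
    if xa_blob is None:
        return default
    config = plistlib.loads(xa_blob)
    code = config.get("Be", {}).get("Code")
    return CODECS.get(code, default)

def load_text(raw, xa_blob):
    """Phase two: decode the raw bytes only once the codec is known."""
    return raw.decode(stored_codec(xa_blob))

raw = "na\u00efve".encode("utf-8")   # file contents, held as undecoded bytes
xa = plistlib.dumps({"Be": {"Code": 4}}, fmt=plistlib.FMT_BINARY)

with_xa = load_text(raw, xa)         # stored encoding wins: text comes back intact
without_xa = load_text(raw, None)    # no stored encoding: fallback garbles it

print(with_xa)
print(without_xa)
```

Holding on to the raw bytes and decoding late is what makes the order of the two steps irrelevant.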

This was taken care of with Rixedit by specifying the encoding by extension, with the extension listed in Info.plist.

Starting with this release of Be, this is no longer necessary.

Starting now, Be will sport only its proprietary extension 'be' and, again for laughs, 'dobedobedo'.

Be's Info.plist will be pruned over time.

Note as well that the same advantage applies to any file that Be stores. So you're not limited to extensions to have this work.

Note too that this Be is still not the final Be. The code we now have is looking good - streamlining and organising are good things - but this app breaks new ground, so we have to tread cautiously.

Thanks.

Changelog

2023-02-24
Be. Adding 'be8' to Info.plist eliminates anomaly but it shouldn't be needed. Review init code.
2023-02-24
Be. ReadFrom wants us to read in before we've read the XA. Can we not take the NSData and wait?
2023-02-25
Be8 works but there's no reason to configure twice.
2023-02-25
Above scuppered.
2023-03-02
Be. Testing streamline version. Very. Carefully. Loading files in new windows is a two-step process. That's just it. You're asked to get the file first, then you're told the window's coming up. The trick is to read the encoding on the 'backside' in phase one. The redundancy we have is the same sub-method's called twice (but with different arguments). Image size as before.

It seems the 'UTF8' issue will always exist, but this new Be technology opens the way for further things.

Now to test for a few weeks!
2023-03-02
Be. Slight optimisation. Same object size but should run init a bit faster - not that we need more speed, but still and all.
2023-03-04
GD. Checking up. GD is fine, but the OS underbody seems to at times need an infusion of new URLs (or something) to kick in.
2023-03-04
TempEdit doc update.
2023-03-04
Be. Can now remove configs specific to UTF-8 as encoding is read before file is read. Further integration planned. Ultimate goal is to find a place for a menu item for 'Save Config' and do it without encumbrance on the app. So you'll have 'New With Config...' and this leads to a popup where you get to choose from configs you've created. You'd also be able to delete configs from the list. That's the goal anyway. We're already 'functional' but not yet 'lazy-ass functional' which is the goal.
