About Those Slangs…

I like the idea of slangs. They let you modify the grammar of Perl 6 (or maybe you’d prefer Q, or perhaps Regex?). This means you could easily switch into a more pythonesque style of programming just by importing a module, such as use Slang::Python.

However, not only are there issues with how slangs currently work (or at least how they’re spec’d to), but I also think they’re unnecessary.

The current problems:

  • They augment by default : the intended use of the ‘slang’ keyword makes it so that you have augment-like behavior without actually using the augment keyword, which means you globally affect the meaning of, say, Perl6::Grammar as soon as the slang statement is interpreted.
  • No explicit way of including actions : the slang keyword is essentially another way of writing a bunch of grammar rules. What the mechanism doesn’t come with is any way to include actions, which those of you writing Perl 6 grammars are very used to by this point. To be fair, inline code blocks can do the exact same thing, but this lack of an explicit mechanism is indicative of a general forgetfulness about the importance of actions to a grammar :) .

Those things are fixable, but there’s a larger issue at play: modifying any grammar after the fact is hard, and for something like Perl 6 it’s daunting. There is no standard set of rule names for an implementation of the grammar, and how to implement the actions of those rules is much harder to standardize (because not everyone will use QAST blocks in their implementation). So this would be an implementation-dependent endeavor, both in terms of supporting multiple implementations and in hoping said implementations don’t break their grammar/action definitions on you.

So, because of how hard it is in general to modify a grammar, I feel that the primary purpose of slangs should be to introduce a new sublanguage, since a new grammar is easy to do :) . I’ve said as much before.

However, I’ve recently realized something: the slang keyword offers no benefits over grammars and actions, at least not in its current form.

With everything at my disposal but slangs, including being able to interact with things like Perl6::Grammar directly, how would I implement a new sublang? Here’s an idea:

use MONKEY_TYPING;

grammar Skylang::Grammar {
    regex TOP { ... }
    ⋮
}

class Skylang::Actions {
    method TOP($/) { ... }
    ⋮
}

augment grammar Perl6::Grammar {
    rule statement_control:sym<SKYLANG> {
        <sym> '☃' ~ '☄' $<srctext>=(<-[☄]>+)
    }
}

augment class Perl6::Actions {
    method statement_control:sym<SKYLANG>($/) {
        make Skylang::Grammar.parse(~$<srctext>, :actions(Skylang::Actions)).ast;
    }
}

(Yes, this would be implementation-dependent too (e.g., on rakudo the ast from Skylang would need to have a bunch of QAST blocks); I never said that was a problem unique to slangs :P)

Plain ol’ modifying the language would involve just the augments. Additionally, depending on what macros end up doing, the above augmentations could be replaced with a single macro declaration.

My point here is that I think Perl 6 already has what you need to modify the very grammar of the language itself. If we can work out what slangs (and those $~ variables) could be that isn’t just a synonym for grammar, then I’d be all for it. However, I can’t presently think of what slang would do differently. The only thing that comes to mind for me is a way of better linking a grammar and its actions together (though that would benefit grammar too, and you can already do it by redefining the parse method (and maybe its friends) in a grammar anyway).

The more interesting question is how to modify the parsing of Perl 6 in an implementation-independent fashion. The grammar side can be helped by standardizing the rules of the grammar, essentially Perl 6’s readable version of a BNF grammar definition in other language specifications. Whether or not the grammar should be standardized at all is another matter though :) .

The actions side, an independent AST specifier, might be far more tricky. But if I understand things correctly, quasi does that for us already.

So I don’t know if we can fashion slang into a far more distinct (and hence useful) keyword than its friend grammar, or if we don’t need it after all, but it’s certainly an interesting thought.

Of course, this is all complicated by the fact that “we must be fairly certain what we want, and we aren’t yet :)”. There’s so little specification of slangs that the point of this post (“slang is useless”) is in all likelihood just plain wrong. Allowing the user to rewrite the language they’re using with the language they’re using is a hard thing to do, and it’s no wonder that all efforts have been placed on everything else so far. But it has led to at least me thinking slang is unnecessary, and if I’m to be proven wrong, it needs to be done soon :P .

Maybe we need a “metalanguage”, like the “metamodel”… would that just be NQP?…

Posted in Think Tank

Perl 6 and CPAN? Well…

So, just earlier today the issue of putting Perl 6 modules (and other assorted things) on CPAN came up. I feel that it’s pertinent to put all of my concerns with this now, instead of waiting until people are in the middle of actually implementing this.

Why not just put the modules on CPAN and be done with it?

Sure! Let’s go ahead and right now upload some of the more well-known Perl 6 modules, like File::Find, Shell::Command, and JSON::Tiny.

Oh.

Yeah, that’s not happening. Perl 6 and Perl 5 are incompatible languages, at least enough so that sharing one universe of module names is absurd.

Why not just prefix Perl 6 modules with Perl6:: ?

You mean Perl 6 module-writers do this for each and every one of their modules? No.

You mean CPAN puts a fake Perl6:: in front of modules for organizational purposes? Alright. Sure hope no existing CPAN modules use Perl6:: as an actual namespace.

Oh.

We could implement all the workarounds we want, but really it would be best for everyone’s sanity if we just maintained separate Perl 5 and Perl 6 worlds.

Other issues

OK, so why not just separate the two worlds on the CPAN servers? Fine, but there are other issues that then come up in the process.

PAUSE needs new scanning tools

Namely, PAUSE currently checks a tarball’s .pm files for the packages they declare, which clearly won’t work with Perl 6. A script that analyzes an S11-compliant META.info would suffice here.
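As a rough illustration of what such a scanning script might do, here is a minimal Python sketch (Python only because it is easy to run anywhere). The field names "name", "version", and "provides" are assumptions about what the metadata file contains, and Example::Module and packages_from_meta are made-up names; this is not PAUSE's actual indexer.

```python
import json

# Hypothetical metadata file. The field names are assumptions for this
# sketch, not a statement of what S11 or any real META.info requires.
SAMPLE_META = """
{
    "name": "Example::Module",
    "version": "0.1.0",
    "provides": {
        "Example::Module": "lib/Example/Module.pm6"
    }
}
"""

def packages_from_meta(meta_text):
    """Return (sorted package names, version) declared by the metadata."""
    meta = json.loads(meta_text)
    # Fall back to the distribution name when nothing is listed explicitly.
    provides = meta.get("provides") or {meta["name"]: None}
    return sorted(provides), meta.get("version", "*")

print(packages_from_meta(SAMPLE_META))
```

The point is only that the indexer reads declared metadata instead of grepping source files for package statements.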

CPAN needs more metadata, and to be more like typical package managers

Have your own, non-CPAN bug tracker? Have a repo that contributors can, uh, contribute to? CPAN’s current solution is to “check the documentation of the module”. I believe this is unacceptable, especially considering how atypically (wrt CPAN) Perl 6 does module distribution. CPAN needs to have “source code” and “bug tracker” links that point to the right place.

In fact, I’d prefer it if CPAN were more like various package manager sites for various Linux distros. That is, more than a place designed to give free tarball hosting space to Perl developers. It should at the very least provide a standardized, metadata-based external link to some sort of homepage.

This brings up a more general issue: S11 and related are designed to specify a full package management system, something much closer to those OS package managers. Granted, I am not at all familiar with Perl 5, much less CPAN, but it just feels like a bare-bones “only what’s needed to install modules easily” kind of thing, which Perl 6 goes beyond. CPAN was built around how Perl 5 does packages; how Perl 6 does packages is designed around what package managers typically do.

The most prominent difference between how Perl 5 works with packages and how Perl 6 works with packages is in authority and versions. CPAN and PAUSE are responsible for handling versions; Perl 5 itself does not handle this. Additionally, module names are owned by one (or more) people, specified by the same infrastructure.

In Perl 6, the version and authority are a part of the package itself. It makes little sense to place restrictions on what names you can use, or have an “upload tarballs only once” policy that requires versioned tarballs.

The versioning shouldn’t be too hard to fix; most tarballs tend to be versioned anyway, but with version info in the module, instead of near it, PAUSE can’t rely on versioning of tarballs anymore, at least not for Perl 6.

The author part is harder; since anyone can create a module with an existing name, so long as they aren’t the same author, this destroys the idea of various people “owning” a particular module name. Where in CPAN I have to explicitly request the ability to update Shell::Command from the right sources, in Perl 6 I can just make a module with that name, that holds the updates (a.k.a. “forking”). I imagine this isn’t easily fixed unless CPAN/PAUSE6 are effectively totally separate from the 5 versions.
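To make the difference concrete, here is a small Python sketch of an index where a distribution is identified by name, authority, and version together, so two authors can publish the same name without either "owning" it. The Dist class, the resolve helper, and the authorities github:alice and github:bob are all hypothetical; this is an illustration of the idea, not how S11 or any existing tool resolves modules.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dist:
    name: str    # module name, e.g. "Shell::Command"
    auth: str    # authority (hypothetical values below)
    ver: tuple   # version as a comparable tuple

# Two authors publish under the same name; nobody owns "Shell::Command".
index = [
    Dist("Shell::Command", "github:alice", (1, 0)),
    Dist("Shell::Command", "github:bob", (1, 1)),
    Dist("Shell::Command", "github:alice", (1, 2)),
]

def resolve(index, name, auth=None):
    """Pick the newest matching distribution, optionally pinned to an authority."""
    candidates = [
        d for d in index
        if d.name == name and (auth is None or d.auth == auth)
    ]
    return max(candidates, key=lambda d: d.ver, default=None)

print(resolve(index, "Shell::Command"))
print(resolve(index, "Shell::Command", auth="github:bob"))
```

A fork here is just another entry with a different auth, which is exactly what a single-owner namespace cannot express.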

Finally…

So, with the need to maintain separate worlds for Perl 5 and Perl 6, and to significantly alter CPAN itself (esp. the interfaces) for the kinds of things Perl 6 is designed for, the question arises: why don’t panda and the ecosystem work well enough already? Sure, CPAN offers a nice place to host tarballs, but aside from that, I think the existing infrastructure that Perl 6 has works. From my view, this would be nothing more than a name change, one that’s of questionable value.

Just to be clear, I’m not totally opposed to a move to CPAN (in fact it would likely give the Perl 6 crowd some much-needed structure in their module distributions). I just have some serious misgivings about what this move would entail, and on some level why putting it all under the CPAN name is better than just putting it under a different name, especially if the two languages would be so separate.

However, I would love to be convinced otherwise, that CPAN would be awesome for Perl 6. This is simply the opinion of someone who’s used cpan all of once or twice for the odd Perl 5 script that needs to be run, and has been able to get along in Perl 6 just fine without CPAN so far.

Additionally, because I believe in lighting candles when it’s dark, I’ll do my best over the next few days to design a mockup of my idea of a Perl 6 package manager, to better illustrate why I’m not sold on CPAN as it is. (Yes, this will most likely revive one certain idea, if maybe not quite in name :P)

Also, mostly as a matter of principle, I refuse to use PAUSE until I’m not forced to give out my full real name. I just don’t see the point of that, and I’m not very liberal with any of my personal information unless it’s absolutely necessary :) .

Posted in Think Tank

A Brand New Spec, S15

So, for the past few days I’ve been working on a provisional S15 mostly for fun. I was considering TimToady’s long-ago suggestion of developing a libicu replacement tuned to Perl 6’s needs, and after learning some interesting things about NFG, I finally got around to writing an S15.

After those few days, S15 has become “good enough” for inclusion into the specs repository, where it will benefit from many people being able to edit the spec. Now anyone with commit access to the specs repository will be able to improve it, as well as anyone who forks the repo :) .

See it here.

The contents of S15 are far from finished. There’s a lot of stuff that still needs working out, such as the functions of the Stringy and Unicodey roles, whether Uni is a rope of multiple Normalization Forms or just a simple string containing that mixture, and the function of string operators now. For instance,

Str ~ Str

concatenates two strings and results in a Str. But what happens when you try

Uni ~ NFC
NFKD ~ NFKC

or any of the other multitude of combinations of string types?

What’s Next?

There are three things I see that I could do at this time:

  1. Write and fudge a bunch of S15 tests. This seems to me to be the most important thing, as it allows us to see how coding with these new things feels before they ever begin to work.
  2. Copy a bunch of S15 information to the rest of the spec. This involves at least, off the top of my head, S05, S32::Str(ing), and S02. Undoubtedly more.
  3. Start migrating the other specs to Pod6. The S15 I placed in the repository makes it the second Pod6-written document in the specs repository. I should think that now’s a good time to migrate the rest of the specs, and modify/replace the relevant scripts in the mu repository to handle Pod6. All this work would of course happen in branches.

The list is in about the order I plan on doing these things, assuming others don’t work on these things first :) .

So please, read our not-yet-stellar provisional draft S15, and get ready for the Unicode Future™.

Posted in Press, Progress Happened

Some Thoughts on Unicode in Perl 6

All of the recent work on Rakudo, getting it to run on the JVM, and the creation of and work on MoarVM as another backend for NQP (and thus Rakudo), has created a sense that we’re really moving forward in Perl 6-land. Maybe Christmas will come this year, or perhaps 2014?

In any case, with Rakudo now on a more mature platform, to be able to implement the big things (such as threads), it seems as though Rakudo is making big leaps towards being fully Perl 6.

Except that actually cannot happen, what with 8 unwritten synopses:

  • S15 — Unicode
  • S18 — Compiling
  • S20 — Introspection*
  • S23 — Security
  • S25 — Portable Perl
  • S27 — Perl Culture*
  • S30 — Standard Perl Library
  • S33 — Diagnostic Messages

And that’s not counting all the other synopses that need a serious rewrite (the higher the spec number, the more likely it’s in need of repair). With the momentum currently going forward in the community, perhaps it’s time we use some of that to fill out the rest of the specification?

If you haven’t guessed already, the spec I’ve been thinking about is S15. Below is a presentation of some notes on the subject I put up a couple of days ago.

Consider this humble Devanagari syllable:

नि (U+0928 U+093F)

Next to it you see the two codepoints that make it up. I shall now present a table on how the various UTFs encode this syllable:

UTF-8     E0 A4 A8 E0 A4 BF
UTF-16BE  0928 093F
UTF-32BE  00000928 0000093F
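The table can be reproduced with a few lines of Python (used here purely as a convenient way to check the numbers; hex_units is a helper name made up for this sketch). Each encoding is split into its code units and printed as hex.

```python
# The syllable from the table: U+0928 followed by U+093F.
syllable = "\u0928\u093F"

def hex_units(text, encoding, unit_bytes):
    """Encode text and format it as space-separated hex code units."""
    data = text.encode(encoding)
    units = [data[i:i + unit_bytes] for i in range(0, len(data), unit_bytes)]
    return " ".join(unit.hex().upper() for unit in units)

# The explicit big-endian codecs are used so that no byte order mark is added.
print("UTF-8   ", hex_units(syllable, "utf-8", 1))
print("UTF-16BE", hex_units(syllable, "utf-16-be", 2))
print("UTF-32BE", hex_units(syllable, "utf-32-be", 4))
```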

When it comes to Unicode there are a number of ways to count characters, depending on your view of the situation. Here’s a quick list, from lowest to highest view:

  • Bytes are a simple count of the number of bytes that make up the given Unicode text.
  • Code units are the smallest units of information in an encoding. The number after UTF indicates the number of bits in a code unit (so the code unit of a UTF-8 text is the byte, 8 bits).
  • Code points are the numbers assigned to each “character” in Unicode. This is independent of encoding. The Devanagari syllable above has two code points.
  • Graphemes are what normally constitute a character to the reader’s eyes, regardless of how many code points make it up. Both ä (a followed by U+0308 COMBINING DIAERESIS) and ä (the single code point U+00E4) are just one grapheme, even though the first one is made up of two code points.

For our Devanagari syllable above, the counting based on viewpoint and encoding is outlined here:

(count by)   UTF-8  UTF-16  UTF-32
bytes        6      4       8
code units   6      2       2
code points  2      2       2
graphemes    1      1       1

As you’ll notice, the counting of codepoints and graphemes is not affected by the text’s encoding. (Also note that the endianness of UTF-16 and UTF-32 doesn’t matter when it comes to counting.)
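The four views can be computed in Python as a sanity check on the table above. Note the loud assumption: approx_graphemes is a crude stand-in that treats every non-mark code point as the start of a grapheme. It is not the full Unicode segmentation algorithm, but it is enough for this syllable. Both function names are invented for this sketch.

```python
import unicodedata

syllable = "\u0928\u093F"

def approx_graphemes(text):
    """Crude grapheme count: every code point that is not a mark
    (general category M*) is assumed to start a new grapheme."""
    return sum(1 for ch in text if not unicodedata.category(ch).startswith("M"))

def count_views(text, encoding, unit_bytes):
    """Count the same text by bytes, code units, code points, graphemes."""
    data = text.encode(encoding)
    return {
        "bytes": len(data),
        "code units": len(data) // unit_bytes,
        "code points": len(text),            # Python 3 str is code-point based
        "graphemes": approx_graphemes(text),
    }

for encoding, unit_bytes in [("utf-8", 1), ("utf-16-be", 2), ("utf-32-be", 4)]:
    print(encoding, count_views(syllable, encoding, unit_bytes))
```

Only the first two rows depend on the encoding argument; the last two are computed from the text alone, which is the point made above.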

Perl 6 has some ideas about Unicode already set, such as counting by graphemes by default (which counts a string containing just our Devanagari syllable above as 1 long, which is what you usually mean).

What I’m putting here today are some of my ideas on what the methods and pragmas involved should look like. I’ve yet to think about Str and Buf specifically (questions such as their relationship with each other and whether more-derived types of Str/Buf (e.g. StrGraphemes) are necessary or useful). There’s hardly enough here for a decent S15, but hopefully enough for a starting point.

Pragmas — Changing Defaults

Perl 6 handles Unicode in a couple of default ways:

  • Encodes Unicode strings in UTF-8
  • Views strings in terms of graphemes unless another view is requested

These defaults should pervade any time you’re dealing with text, whether it’s a literal string, user input, or non-binary file I/O. You can always change these, such as .codes to count the string by codepoints, or open("file", :enc<UTF-16BE>) to open a text file you know is encoded as UTF-16BE.

But if you’re dealing with a lot of UTF-32LE encoded files, or you need to do a lot of string operations at the code unit level, then pragmas are the way to change these defaults. Here are the pragmas as I imagine spelling them:

use utf8;          # use UTF-8 encoding
use utf16 :be/:le; # use UTF-16[BE|LE] encoding
use utf32 :be/:le; # use UTF-32[BE|LE] encoding

use graphemes;  # count by graphemes
use codepoints; # count by code points   
use codeunits;  # count by code units
use bytes;      # count by bytes

There is also one other pragma I’ve thought up, although its usefulness is very questionable:

use normalization :NFC/:NFD/:any
# compose/decompose/leave be all characters
# in strings at time of creation.

Methods for Str

These methods either count characters in a certain way, or (de)compose them, or change the encoding of the Str. Here’s the list, some of which are already specced:

.chars  # count by the current default view (default .graphs)
.graphs # count by graphemes
.codes  # count by code points
.units  # count by code units
.bytes  # count by bytes

.compose   # convert string to NFC form
.decompose # convert string to NFD form

.convert # change the encoding of the Str

There’s likely a host of other functionality that Strs need, but these are the ones that have come to mind.
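For a feel of what .compose and .decompose would do, here is the equivalent in Python using its standard unicodedata module. The function names compose and decompose simply mirror the proposed method names; this is an analogy in another language, not the Perl 6 API.

```python
import unicodedata

composed = "\u00E4"      # ä as one code point
decomposed = "a\u0308"   # a followed by COMBINING DIAERESIS

def compose(text):
    """Convert text to NFC, like the proposed .compose."""
    return unicodedata.normalize("NFC", text)

def decompose(text):
    """Convert text to NFD, like the proposed .decompose."""
    return unicodedata.normalize("NFD", text)

print(len(composed), len(decompose(composed)))    # code point counts
print(compose(decomposed) == composed)
```

The code point count changes under normalization while the grapheme count stays at one, which is why the default view matters.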

Closing Thoughts

I don’t think Perl 6 needs a separate type for a single character (the Char and AnyChar found in the untouched corners of the spec). It feels like an unnecessary addition; it’s hard for me to see a time where a one-character string needs to be treated differently from a multichar string.

Also appearing a couple of times in the spec is the idea of counting characters with adverbs such as :ArabicChars in addition to :graphs. I’d like to see examples of scripts where a “grapheme” is not always the same as a complete “character” before going along with such language-specific counting mechanisms in core.

I also feel that Buf needs better explanation. I’m thinking about it now, and I suppose I need some convincing that we need to consider Buf the cousin of Str. I think it’s useful to have a type of array that’s designed to work with binary files (something I feel Buf is perfect for), but I have doubts about treating it like a numbers-based look at Str.

To put it another way, I’ve always used, and thus see, Buf as a way of interacting with binary files, and its data. I have a hard time believing such an object should be tasked with text-based knowledge, such as if it’s a valid Unicode string, when that may not be the case.

(Although derived versions of Buf, such as Utf8, could perfectly place text-based restrictions on its data. But leave Buf out of it. :) )

Finally, I hope that we can soon get a decent S15 written, and then maybe also finish the rest of the spec. How ’bout it?

* To be fair, there are two drafts, one for S20 and one for S27. They’re available from the front page of the HTML specs, the A20 draft (yes, an apocalypse draft) and the S27 draft. The A20/S20 draft might be worth a look and combining with what jnthn’s debugger does. The S27 draft, in my opinion, should be ignored without a second thought.

Maybe there should be an S34 for 6model, making it nine unwritten.

Posted in Think Tank

A new Perl 6 major mode for emacs! (In Progress)

I’ve finally gotten fed up with cperl-mode: it has now decided to throw Lisp errors upon typing things like : or }, or anything else that would make use of its electric feature (which I couldn’t find out how to turn off fast enough). So I’ve decided to make an earnest effort at creating my own Perl 6 major mode.

Ta-da!

Right now this mode highlights "" and '' strings (because emacs provides string and comment highlighting for you; I have yet to set up comments), as well as sigilled variables with identifier names and also $_, $!, and $/. There’s also indentation code that works well, but the rules that govern it will certainly need refinement over time as more specialized rules are discovered :) .

If you use emacs, I invite you to try it out. The easiest way to use it is to open it in emacs, type M-x eval-buffer, and then open a Perl 6 file and type M-x p6-mode . The only part that will likely be bug-report-worthy* is going to be when the indentation code doesn’t work right. Issues with highlighting are not worthwhile to report until there’s quite a bit of highlighting support :) .

I hope that in the coming weeks/months this will turn into a fully-functional Perl 6 major mode. It will also, after full Standard Perl 6 support is done, be extensible due to the highly malleable nature of Perl 6 parsing. (see the repo’s README for more on this)

Have the appropriate amount of fun!

*Lisp errors that throw you into a debugger and turn all of your modelines from saying (Fundamental) to [(Fundamental)] are of course another problem to report. If that happens, follow these steps (or else that [(Fundamental)] thing, called recursive edit mode, will possibly never go away until you restart emacs):

  1. Select the entire error (with your mouse or C-x h)
  2. Copy it (M-w)
  3. Exit the debugger (q)
  4. Paste in another buffer before you lose the error, *scratch* is a good choice (C-x b *scratch* C-y)
  5. Report the issue to https://github.com/lue/p6mode/issues, preferably with the error message you just copy-pasted :) .

Posted in Progress Happened

The Rakudo Codebase: Visualized! (Partially)

Yesterday I stumbled upon this old perl6-compiler mailing list message which inspired me to actually try to split up the compilation of CORE.setting into smaller, saner pieces. I’ve started work on this already (so far having just modified the Makefile).

The one and only response to that post suggested using the stub syntax to resolve missing class issues. Sadly, that does not work, because perl6 needs to see the stubbed class in full sometime later in the file.

So, the best option I’ve seen is to include the files containing needed classes and roles into files that need them (through either use or require). This requires knowing what to include (as including everything everywhere ruins the point), and I’ve started to try to grasp what’s going on in the rakudo codebase, specifically src/core, in terms of dependencies.

Here’s what I have so far. It’s right now just one, single script. This single script graphs the inheritance chain of all the classes and roles in src/core (what classes and roles “is” and “does”). You need the modules Term::ANSIColor, Term::ProgressBar, and IO::Capture::Simple (at this time pulled in, but apparently unused, by Term::ProgressBar). You also need ack and graphviz (which is not used in the script, but needed to process the resulting core.dot file).

Wanna see the graph for nom commit bf472b0, the latest as of this writing? Here you go (click to embiggen, clearly):

The arrows point towards what a particular box inherits from, and the one or two white boxes mean that the class or role doesn’t exist within the src/core files. Yellow denotes classes, red denotes roles. Each box contains the name of the class/role, the file it was found in, and the line number of its definition.
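To show the general shape of such a graphing script, here is a short Python sketch. It is not the script linked above, just an illustration under stated assumptions: declarations are found with a regular-expression heuristic (only capitalized names after "is" or "does" count as parents, so traits like "is rw" are skipped), and the SAMPLE sources are invented.

```python
import re

# Heuristic patterns; a real parser would be needed for full accuracy.
DECL = re.compile(r"^\s*(?:my\s+)?(class|role)\s+([\w:]+)([^{;]*)", re.MULTILINE)
PARENT = re.compile(r"\b(?:is|does)\s+([A-Z][\w:]*)")
COLORS = {"class": "yellow", "role": "red"}

def inheritance_dot(sources):
    """Build DOT text from a dict mapping file name to source text."""
    nodes, edges = {}, []
    for filename, text in sources.items():
        for match in DECL.finditer(text):
            kind, name, traits = match.groups()
            line = text.count("\n", 0, match.start(2)) + 1
            nodes[name] = (kind, filename, line)
            edges += [(name, parent) for parent in PARENT.findall(traits)]
    out = ["digraph core {", "    node [shape=box, style=filled];"]
    for name, (kind, filename, line) in sorted(nodes.items()):
        out.append(f'    "{name}" [fillcolor={COLORS[kind]}, '
                   f'label="{name}\\n{filename}:{line}"];')
    # Parents that were never declared in the scanned files become white boxes.
    for parent in sorted({p for _, p in edges if p not in nodes}):
        out.append(f'    "{parent}" [fillcolor=white];')
    for name, parent in edges:
        out.append(f'    "{name}" -> "{parent}";')
    out.append("}")
    return "\n".join(out)

# Invented sample input, standing in for files under src/core.
SAMPLE = {
    "Foo.pm": "my class Foo is Cool does Positional {\n}\n",
    "Positional.pm": "my role Positional {\n}\n",
}
print(inheritance_dot(SAMPLE))
```

Saving the output as core.dot and running something like dot -Tpng core.dot -o core.png would render it.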

This, however, doesn’t even begin to deal with all the dependencies that abound in the src/core files. See all those abandoned colored blocks? Those aren’t inherited by anything in src/core, but they most certainly are utilized (e.g. the Pod::Config block, though not inherited, could easily be utilized as a variable type in the Pod code). I have a lot more work to do before I have a clear picture of how to separate CORE.setting into smaller pieces. (I might even need to utilize STD.pm6 or similar to do all the parsing I’ll eventually be doing!)

Just to finish off, here are a couple more graphs of the same data. The first graph above was generated with GraphViz’s “dot”. This one (generated by their “twopi” program) I think looks hilarious:

And this picture was generated by GraphViz’s “fdp”. I think this one is the easiest to follow.

Happy viewing!

Posted in Progress Happened, Research Department

Making an IF game in August

After masak is done with his July of blogging and separates Adventure::Engine from his game, I’ll be taking it and during the month of August crafting a game with it, adding various improvements to the engine. Here’s a list of the general things I’ve planned so far.

Note: this list may change depending on what masak does in the last couple days of his blogging month.

  • Modify Adventure::Engine to implement game objects as objects
  • Grammar/Actions for the game commands.
  • NPCs!
  • in-program descriptions
  • Curses!

Let me explain:

OOP’d Objects

This I’ll most likely be doing first. While masak’s Adventure::Engine this year is great, it doesn’t use objects for the in-game objects. Instead, it uses hashes and lists to keep track of the features of various objects. I personally would like to group information about objects into, well, objects. I will, however, avoid becoming too class-happy like masak’s solution to an adventure game last year (making every game object its own class is just a wee bit class-happy in my opinion).

I’d accomplish this either by taking Adventure::Engine and having a local copy in the repo to modify, or by using augment and/or supersede in my own code, which would be a good opportunity to show off those rarely-used keywords.

The one problem I’d face is that it’s harder to have lists of objects when you’ve gone OOP. It’s (AFAICT) much easier to get a list of all the objects in the game in masak’s current Adventure::Engine. Oh well, I spy some binding will come into play \o/.

Grammar for commands

This is just a small little thing for me. I’ll have a Grammar and Actions turn the input on the commandline into something nicer to handle in the rest of the code.

NPC characters

Even though in any normal IF game I’d write, I’d avoid a story that needs any NPCs, I figured I’d implement NPCs and after August release it as, say, Adventure::Engine::NPC (depending).

Descriptions in Code

This is almost a non-item; I’ll just modify Adventure::Engine to have descriptions in-game.

Curses!

Because I want a status line at the top (like ye olde Infocom), why not (n)curses? If there is no existing Perl 6 module to interface with curses, I’ll write one (NativeCall GO!). I considered writing a module (if there is no existing curses module) that uses a laborious amount of print statements and \e characters, but I’ll try to interface with ncurses first.
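For comparison, the "print statements and \e characters" fallback might look something like this Python sketch, which builds an Infocom-style status line from standard ANSI/VT100 escape sequences. The status_line function and the sample text are made up for illustration, and it assumes a terminal that honours these sequences.

```python
import sys

ESC = "\x1b"

def status_line(left, right, width=80):
    """Return a string that draws a reverse-video status bar on row 1
    and then puts the cursor back where it was."""
    # Pad so that `right` ends at the last column, then clip to the width.
    text = (left.ljust(width - len(right)) + right)[:width]
    return (
        f"{ESC}7"        # save cursor position
        f"{ESC}[1;1H"    # move to row 1, column 1
        f"{ESC}[7m"      # reverse video on
        f"{text}"
        f"{ESC}[0m"      # reset attributes
        f"{ESC}8"        # restore cursor position
    )

sys.stdout.write(status_line("West of House", "Score: 0", width=40))
sys.stdout.flush()
```

It works, but every redraw, resize, and scroll has to be handled by hand, which is the laborious part that a curses binding would take care of.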

Although I don’t know if I can do a month of blogging, I’ll try to fit it all within the month of August.

Posted in Uncategorized