feature request: unicode in source #1406

Open
@tpapp

Introduction

Some languages now support Unicode (mostly UTF8) in source code. It would be great if Stan source could use Unicode as well. (Note that comments in UTF8, or in any other encoding that embeds ASCII, are already supported, in the sense that the parser simply ignores them.)

Broadly, there are two possible levels of support:

  1. in variable and function names (e.g. ϕ), and
  2. in operators (e.g. ≤), which provide synonyms for existing ones (e.g. <=)
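
To make the two levels concrete, here is a minimal Python sketch of what each might involve. It is not Stan's parser: the names `OPERATOR_SYNONYMS`, `is_unicode_identifier`, and `desugar_operators` are hypothetical, the `≤` entry is inferred from the `<=` example above, and the other table entries are assumptions added for illustration. Python's own `str.isidentifier()` stands in for an identifier rule, which is not necessarily the rule Stan would adopt.

```python
# Hypothetical table mapping Unicode operator synonyms to existing ASCII operators.
OPERATOR_SYNONYMS = {
    "≤": "<=",
    "≥": ">=",  # assumption, not from the issue
    "≠": "!=",  # assumption, not from the issue
}


def is_unicode_identifier(name: str) -> bool:
    """Level 1: accept names such as ϕ or μ alongside plain ASCII names.

    Uses Python's identifier rule as a stand-in for whatever Stan would define.
    """
    return name.isidentifier()


def desugar_operators(source: str) -> str:
    """Level 2: rewrite Unicode operators to their ASCII synonyms.

    Deliberately naive: it replaces every occurrence, including those
    inside comments, which a real lexer would leave alone.
    """
    for symbol, ascii_op in OPERATOR_SYNONYMS.items():
        source = source.replace(symbol, ascii_op)
    return source
```

Treating operators as pure synonyms keeps level 2 a purely lexical change: after the rewrite, the rest of the toolchain would see only the existing ASCII operators.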

Example

This is what the 8 schools example would look like in Unicode:

data {
  int<lower=0> J;             // number of schools
  real y[J];                  // estimated treatment effect (school j)
  real<lower=0> σ[J];         // std err of effect estimate (school j)
}
parameters {
  real μ;
  real θ[J];
  real<lower=0> τ;
}
model {
  θ ~ normal(μ, τ); 
  y ~ normal(θ, σ);
}

Possible benefits

  1. more compact source code
  2. better mapping to equations in papers

Possible downsides

  1. editor/entry support
  2. font support
  3. possibly corrupted files

The first two are mitigated by the fact that ASCII is a subset of UTF8, so using the feature is optional.
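
The third downside, corrupted files, can at least be detected early. The sketch below is only an illustration in Python, with a hypothetical helper `decode_source`. It shows that pure-ASCII program text decodes identically as UTF8, and that a damaged multi-byte sequence is rejected rather than silently accepted.

```python
def decode_source(raw: bytes) -> str:
    """Decode program text as UTF-8, failing loudly on corrupted bytes."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        raise ValueError(f"not valid UTF-8 at byte {err.start}") from err


# Pure-ASCII text is already valid UTF-8 and decodes to the same string.
ascii_program = b"parameters { real mu; }"
assert ascii_program.decode("ascii") == decode_source(ascii_program)

# Text with a Greek identifier round-trips through UTF-8 unchanged.
utf8_program = "parameters { real μ; }".encode("utf-8")
assert decode_source(utf8_program) == "parameters { real μ; }"

# A lead byte (0xCE) without its continuation byte is a corrupted sequence.
corrupted = b"real \xce;"
try:
    decode_source(corrupted)
except ValueError as err:
    print(err)
```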

UTF8 support in various languages which have interfaces for Stan

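| language | literals | identifiers | operators | would UTF8 variables work for interfacing with Stan? |
|----------|----------|-------------|-----------|-------------------------------------------------------|
| R        | yes      | yes         | no        | yes |
| Python   | yes      | only from version 3 | no | yes, even in Python 2, as they are used as literal keys |
| Julia    | yes      | yes         | yes       | yes |
| Matlab   | yes      | yes, but needs to be enabled | no | yes |
| Stata    | yes      | yes, from version 14 | no | probably? |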
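
As an illustration of the Python row, the sketch below builds a data dictionary for the 8 schools model with a Unicode key. The values are the commonly quoted eight schools data. The dictionary layout is an assumption about how an interface would receive data, and no actual Stan interface call is shown, since accepting such a key depends on Stan supporting the name in the first place.

```python
# Eight schools data as commonly quoted; variable names mirror the model above.
J = 8
y = [28, 8, -3, 7, -1, 1, 18, 12]
σ = [15, 10, 16, 11, 9, 11, 10, 18]  # Unicode identifier: Python 3 only

# The data reaches the interface keyed by strings, so the Unicode name only
# needs to be a valid string key, not a valid Python identifier.
data = {"J": J, "y": y, "σ": σ}

# Basic consistency check: one observation and one standard error per school.
assert len(data["y"]) == data["J"]
assert len(data["σ"]) == data["J"]
```

Because the key is just a string, the same dictionary could be written in Python 2 without using `σ` as a variable name, which is the point made in the table.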

Editor support

Emacs

See this list for various UTF8 implementations using autocomplete, company-mode, and quail.
