PROGRAMMING LANGUAGES LANDSCAPE
OLD & NEW IDEAS
Ruslan Shevchenko
VertaMedia/ Researcher
ruslan@shevchenko.kiev.ua
https://github.com/rssh
@rssh1
PROGRAMMING LANGUAGES LANDSCAPE: OLD & NEW IDEAS
What programming will be like tomorrow.
Languages
Complexity
Hardware
Worlds
Learning Curve
Expressibility
Layers
[Figure: language layers by era. Eras: 80 - 2000, 2000 - 2020, 2020 - 20XX, 20XX - YYYY. Languages shown: ASM, COBOL, PASCAL, BASIC, C, C/C++, TCL, JAVA, JAVASCRIPT, RUST, SCALA (?), JS; JULIA (?), J* (??), CREOLE ENGLISH (?), QUANTUM (???), ???]
Hardware
1 Processor Unit
1 Memory Unit
1 Machine
N Different Processors (CPU, GPU, NTU, QTU)
N Different Storage Systems (Cache, Mem, SSD, ..)
N Different Machines
PL: Main Language Constructs:
still execution flows
Memory Access Evolution:
Fortran 57: static memory allocation
Algol 60: stack
Lisp 58: garbage collection
BCPL, C [70] - manual
Objective-C [88] - reference counting
Java [95] - garbage collection becomes mainstream
Rust [2010-15] - compile-time analysis becomes mainstream
C++ [..88] - manual + destructors
Algol 68 - stack + manual + GC
Smalltalk [72-80] - GC (RC as GC optimization)
Objective-C++
ML [70] - compile-time analysis
Simula 67
// not a complete list, and not only the main languages
Memory Access Evolution:
Manual allocation: Risky, low-level
Garbage Collection: Generally Ok, but pauses:
not for Real-time systems
not for System-level programming
Type analysis [Rust]
subculture of sun.misc.Unsafe in Java
RUST: ownership & lifetimes
T - object of type T (owned by code in scope)
&T - borrowed reference to type T (owned not by us)
&'L T - reference to type T with lifetime L
mut T - mutable object of type T
*const T, *mut T - raw unsafe pointers
let y: &str;
{
    let email = retrieve_email(….. );
    let domain = first_entry(&email, "@");
    y = domain;
}
// does not compile: y lives in the outer scope and would outlive email.
fn first_entry<'a, 'b>(value: &'a str, pattern: &'b str) -> &'a str
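The lifetime signature above can be tried out with a small compilable sketch. The body of `first_entry` is an assumption (the slide only gives its signature): here it returns the part of `value` before the first occurrence of `pattern`. The point is that the result borrows only from `value`, so it may outlive `pattern` but not `value`.

```rust
/// Hypothetical body for the slide's signature: returns the part of `value`
/// before the first occurrence of `pattern`, or all of `value` if absent.
fn first_entry<'a, 'b>(value: &'a str, pattern: &'b str) -> &'a str {
    match value.find(pattern) {
        Some(pos) => &value[..pos],
        None => value,
    }
}

fn main() {
    let email = String::from("user@example.org");
    let local;
    {
        let pattern = String::from("@");
        // The result is tied to `email` ('a), not to `pattern` ('b).
        local = first_entry(&email, &pattern);
    } // `pattern` is dropped here; `local` is still valid.
    println!("{}", local);
}
```

Swapping the roles (declaring `email` inside the inner block and using the result outside) is rejected by the borrow checker, which is the situation the slide describes.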
RUST: general
Next step in low-level systems languages.
Zero-cost abstractions + safety
more difficult to write than a GC language
fast and easy in comparison with C [maybe C++]
Alternatives:
advanced GC [Go, D, Nim]
Concurrency Models Evolution:
Fortran 57: one execution flow
PL/1 64: multitasking API
1972: Actor Model
1988: Erlang [Actor Model implementation]
1978: CSP Model
1983: Occam [first CSP Model implementation]
1980: Implicit parallelism in functional languages (80-1)
1977: Future [MultiLisp]
2007: Go (CSP becomes mainstream)
2010: Akka in Scala (Actor Model becomes mainstream)
2015: Pony [actors + ownership]
// not a complete list, and not only the main languages
Concurrency Models:
Callbacks: [manual], Futures [Semi-manual]
hard to maintain
Actor-Model (Active Object)
CSP Channels; Generators
Async methods.
lightweight threads [coroutines, fibers .. ]
execution flow ‘breaks’ thread boundaries.
Implicit parallelism
hard to implement, not yet in mainstream
Actor Model:
// skip on demand
CSP Model:
// skip on demand
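Since the CSP slide is skipped, a minimal sketch of the channel style may help. It uses Rust's standard `std::sync::mpsc` channel; the function name `squares_via_channel` is made up for illustration and is not from the talk.

```rust
use std::sync::mpsc;
use std::thread;

/// A producer thread sends squares over a channel; the caller receives them.
fn squares_via_channel(inputs: Vec<u64>) -> Vec<u64> {
    let (tx, rx) = mpsc::channel();
    let producer = thread::spawn(move || {
        for n in inputs {
            tx.send(n * n).expect("receiver is still alive");
        }
        // `tx` is dropped when the closure ends, which closes the channel.
    });
    // Iteration ends once the channel is closed and drained.
    let results: Vec<u64> = rx.iter().collect();
    producer.join().expect("producer thread panicked");
    results
}

fn main() {
    println!("{:?}", squares_via_channel(vec![1, 2, 3]));
}
```

The two execution flows share no mutable state; all communication goes through the channel.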
Async/Transform (by compiler/interpreter):
def method():Future[Int] = async {
val x = retrieveX()
val y = retrieveY()
x+y
}
def method():Future[Int] = async {
val x = await(retrieveX())
val y = await(retrieveY())
x+y
}
class methodAsync {
  var state: Int = 0
  val promise: Promise[Int] = Promise[Int]()
  var x, y: Int = 0
  def m():Unit =
  {
    state match {
      case 0 => retrieveX().onSuccess { case v => x = v; state = 1; m() }
      case 1 => retrieveY().onSuccess { case v => y = v; state = 2; m() }
      case 2 => promise.success(x+y)
    }
  }
}
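The same transformation can be sketched without any futures library: each `await` point becomes a state, and a completed operation resumes the machine with its result. This is a hand-written illustration of the idea only, not what any particular compiler emits; the names `Method` and `resume` are hypothetical.

```rust
/// State machine for: async { let x = await(retrieve_x());
///                            let y = await(retrieve_y()); x + y }
enum Method {
    AwaitingX,
    AwaitingY { x: i32 },
    Done(i32),
}

impl Method {
    /// Feed the result of the operation currently awaited and move one state on.
    fn resume(self, value: i32) -> Method {
        match self {
            Method::AwaitingX => Method::AwaitingY { x: value },
            Method::AwaitingY { x } => Method::Done(x + value),
            Method::Done(result) => Method::Done(result),
        }
    }
}

fn main() {
    let m = Method::AwaitingX;
    let m = m.resume(2); // retrieve_x completed with 2
    let m = m.resume(3); // retrieve_y completed with 3
    if let Method::Done(result) = m {
        println!("{}", result);
    }
}
```

Local variables that live across an `await` (here `x`) move into the state itself, which is why such code needs no dedicated thread stack.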
Concurrency Models / current state
Problems:
Data Races. Possible solutions:
immutability (functional programming)
copy/move semantics [Go, Rust]
static alias analysis [Rust, Pony]
Async IO interfaces.
Future:
Heterogeneous/Distributed case
Implicit parallelism
RUST: race control
T: std::marker::Send
- it is safe to send the object to another thread
- otherThread(t) is safe
T: std::marker::Sync
- it is safe to share the object between threads
- share = send a reference
- otherThread(&t) is safe
{
    let x = 1;
    thread::spawn(|| {
        do_something(x)
    });
}
// error - the closure borrows x, which may not live long enough
{
    let x = 1;
    thread::spawn(move || {
        do_something(x)
    });
}
// ok - the closure owns a copy of the original x
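A compilable sketch of both cases, using only the standard library: moving an owned value into a thread (requires `Send`) and sharing one value between threads through `Arc` (requires `Send + Sync`). The helper names `sum_in_thread` and `shared_len` are illustrative.

```rust
use std::sync::Arc;
use std::thread;

/// Ownership of `data` moves into the worker thread (Vec<i32> is Send).
fn sum_in_thread(data: Vec<i32>) -> i32 {
    let handle = thread::spawn(move || data.iter().sum::<i32>());
    handle.join().expect("worker thread panicked")
}

/// The vector is shared, not copied: only the reference count is cloned.
fn shared_len(data: &Arc<Vec<i32>>) -> usize {
    let shared = Arc::clone(data);
    let handle = thread::spawn(move || shared.len());
    handle.join().expect("worker thread panicked")
}

fn main() {
    println!("{}", sum_in_thread(vec![1, 2, 3]));
    let data = Arc::new(vec![1, 2, 3]);
    println!("{}", shared_len(&data));
    println!("{}", data.len()); // the original handle is still usable
}
```

Replacing `Arc` with `std::rc::Rc` makes the second function fail to compile, because `Rc` is not `Send`.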
Pony:
Actors
Type analysis for data sharing.
Pony type = type + capability
- T iso - isolated
- T val - value
- T ref - reference
- T box - read-only
- T trn - transition (write part of the box)
- T tag - identity only
Destructive read/write
fun test(T iso a) {
  var ref x = a
}
// error - an iso cannot be aliased as a ref
fun test(T iso a){
  var iso x = consume a // ok
  var iso y = a
  // error - a is consumed
}
Distributed computations:
Thread boundaries + Network boundaries
Locality hierarchy
Failures
Scala, count words:
// Same code, different execution
val lines = load(uri)
val count = lines.flatMap(_.split(" "))
                 .map(word => (word, 1))
                 .reduceByKey(_ + _)
Java, count words:
@Override
public void map(Object key, Text value, Context context
) throws IOException, InterruptedException {
String line = (caseSensitive) ?
value.toString() : value.toString().toLowerCase();
for (String pattern : patternsToSkip) {
line = line.replaceAll(pattern, "");
}
StringTokenizer itr = new StringTokenizer(line);
while (itr.hasMoreTokens()) {
word.set(itr.nextToken());
context.write(word, one);
Counter counter = context.getCounter(CountersEnum.class.getName(), CountersEnum.INPUT_WORDS.toString());
counter.increment(1);
}
}
}
public static class IntSumReducer
extends Reducer<Text,IntWritable,Text,IntWritable> {
private IntWritable result = new IntWritable();
public void reduce(Text key, Iterable<IntWritable> values,
Context context
) throws IOException, InterruptedException {
int sum = 0;
for (IntWritable val : values) {
sum += val.get();
}
result.set(sum);
context.write(key, result);
}
}
Can we do better (?) - Yes [but not for free]
- retargeting stream API (impl.effort)
- via annotation processor
- use byte-code rewriting (low-level)
Java, count words(2):
// Nearly the same code, different execution
List{???}<String> lines = load(uri);
Map<String,Integer> count = lines.stream()
    .flatMap(x -> Arrays.stream(x.split(" ")))
    .collect(Collectors.groupingBy{Concurrent,Distributed}(w -> w,
        Collectors.mapping(w -> 1,
            Collectors.reducing(0, Integer::sum))));
[a distributed version is theoretically possible]
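For comparison, the same flatMap / map / reduce-by-key pipeline can be written over plain in-memory iterators. This is a local, single-threaded sketch; `count_words` is an illustrative name, and nothing here is distributed.

```rust
use std::collections::HashMap;

/// Word count as a pipeline: split lines into words, pair each word with 1,
/// then sum the counts per key.
fn count_words(lines: &[&str]) -> HashMap<String, usize> {
    lines
        .iter()
        .flat_map(|line| line.split_whitespace())
        .map(|word| (word.to_string(), 1usize))
        .fold(HashMap::new(), |mut counts, (word, n)| {
            *counts.entry(word).or_insert(0) += n;
            counts
        })
}

fn main() {
    let counts = count_words(&["a b a", "b c"]);
    println!("{:?}", counts.get("a"));
}
```

The pipeline describes what to compute, not where; that separation is what lets a different backend run the same shape of code in parallel or across machines.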
Ideas
Language Extensibility: F: A=>B F: Expr[A] => Expr[B]
• Functional interpreters: Expr[A] built on top of L
• well-known functional programming pattern
• Macros: Expr[A] == {program in A}
• Lisp macros [1960 … ]
• Compiler plugins [X10],
• Non-standard interpretation of arguments [R]
A rich enough type system to express Expr[A] (inside the language)
Language Extensibility: F: A=>B F: Expr[A] => Expr[B]
Small example (functional compiler)
trait GE[T]
case class Code(
  fundefs: Map[String, String],
  expr: String
)
trait GERunner
{
  def loadValues(values: Map[String,Array[Double]]): Unit
  def loadCode(code: GE[_]): Unit
  def run(): Unit
  def retrieveValues(name:String):Array[Double]
}
// GPU contains OpenCL or CUDA compiler
// available via system API
case class GEArray(name:String) extends GE[Array[Double]]
{ self =>
  def apply(i:GE[Int]): GE[Double] = GEArrayIndex(this,i)
  def update(i:GE[Int],x:GE[Double]): GE[Unit] = GEUpdate(this,i,x)
  // `self` is needed below: inside `new { }`, `this` would be the anonymous object
  def index = new {
    def map(f: GE[Int] => GE[Double]):GE[Array[Double]] = GEMap(self,f)
    def foreach[T](f:GE[Int] => GE[T]):GE[Unit] = GEForeach(self,f)
  }
}
case class GEPlus(x: GE[Double], y: GE[Double])
  extends GE[Double]
implicit class GEPlusSyntax(val x:GE[Double]) extends AnyVal
{
  def + (y:GE[Double]) = GEPlus(x,y)
}
case class GEMap(a:GE[Array[Double]],f:GE[Int]=>GE[Double])
  extends GE[Array[Double]]
case class GEArrayIndex(a: GE[Array[Double]],i:GE[Int])
  extends GE[Double]
case class GEConstant[T](x:T) extends GE[T]
case class GEVar[T](name:String) extends GE[T]
val a = GEArray("a")
val b = GEArray("b")
val c = GEArray("c")
for( i<- a.index) {
  c(i) = a(i) + b(i)
}
// desugars to:
a.index.foreach(i => c(i) = a(i)+b(i) )
// then to:
a.index.foreach(i => c.update(i,
  GEArrayIndex(a,i)+GEArrayIndex(b,i)))
// which builds the tree:
GEForeach(a, i =>
  GEUpdate(c,i,
    GEPlus(GEArrayIndex(a,i),GEArrayIndex(b,i))))
trait GE[T] { def generate(): GPUCode }
case class GPUCode(
  defs: Map[String,String],
  expr: String
)
class GEArrayIndex(x:GE[Array[Double]], i:GE[Int]) extends GE[Double]
{
  def generate():GPUCode =
  {
    val (cx, ci) = (x.generate(),i.generate())
    GPUCode(defs = merge(cx.defs,ci.defs),
            expr = s"${cx.expr}[${ci.expr}]")
  }
}
GEArrayIndex(GEArrayVar(a),GEVar(i)) => "a[i]"
class GEIntVar(val name:String) ..
{
  def generate():GPUCode =
    GPUCode(
      defs = Map(name -> s"int ${name};"),
      expr = name)
}
GEPlus(GEArrayIndex(GEArrayVar(a),GEVar(i)),
       GEArrayIndex(GEArrayVar(b),GEVar(i))) =>
  "a[i] + b[i]"
class GEPlus(x:GE[Double], y:GE[Double])
{
  def generate():GPUCode =
  {
    val (cx, cy) = (x.generate(),y.generate())
    GPUCode(defs = merge(cx.defs,cy.defs),
            expr = s"(${cx.expr} + ${cy.expr})")
  }
}
c.update(i,a(i)+b(i)) => "c[i] = a[i] + b[i]"
class GEUpdate(x:GE[Array[Double]], i:GE[Int], y:GE[Double])
{
  def generate():GPUCode =
  {
    val (cx, ci, cy) = (x.generate(), i.generate(), y.generate())
    GPUCode(defs = merge(cx.defs,ci.defs,cy.defs),
            expr = s"${cx.expr}[${ci.expr}] = ${cy.expr}")
  }
}
class GEForeach[T](x:GE[Array[Double]],
                   f:GE[Int] => GE[T] )
{
  def generate():GPUCode =
  {
    val i = new GEIntVar(System.newName)
    val (cx, ci, cfi) = (x.generate(), i.generate(), f(i).generate())
    val fName = System.newName
    val fBody = s"""
      __kernel void ${fName}(${genParamDefs(x)}) {
        int ${i.name} = get_global_id(0);
        ${cfi.expr};
      }
    """
    GPUCode(
      defs = merge(cx.defs,ci.defs,cfi.defs,Map(fName -> fBody)),
      expr = s"${fName}(${genParams(x)})")
  }
}
for(i <- a.index)
  c(i)=a(i)+b(i)
=>
defs: """
__kernel void f1(__global double* a,
                 __global double* b,
                 __global double* c, int n) {
  int i2 = get_global_id(0);
  c[i2] = a[i2]+b[i2];
}
"""
Finally:
val a = GEArray("a")
val b = GEArray("b")
val c = GEArray("c")
for( i<- a.index) {
  c(i) = a(i) + b(i)
}
=>
__kernel void f1(__global double* a,
                 __global double* b,
                 __global double* c,
                 int n) {
  int i2 = get_global_id(0);
  c[i2] = a[i2]+b[i2];
}
GPUExpr( )
// with macros this can be done at compile time
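The whole pipeline (build an expression tree, then walk it to emit kernel source) fits in a short self-contained sketch. This is a simplified re-telling in Rust, not the author's Scala code: the `Expr` enum, `generate`, `kernel` and the helper constructors are names chosen here for illustration, and the output is only a string that looks like OpenCL; nothing is compiled or run on a GPU.

```rust
/// Expression tree of a tiny array DSL.
enum Expr {
    Var(String),
    Index(Box<Expr>, Box<Expr>),
    Plus(Box<Expr>, Box<Expr>),
    Update(Box<Expr>, Box<Expr>, Box<Expr>),
}

/// Non-standard interpretation: instead of evaluating, emit source text.
fn generate(e: &Expr) -> String {
    match e {
        Expr::Var(name) => name.clone(),
        Expr::Index(a, i) => format!("{}[{}]", generate(a), generate(i)),
        Expr::Plus(x, y) => format!("{} + {}", generate(x), generate(y)),
        Expr::Update(a, i, v) => {
            format!("{}[{}] = {}", generate(a), generate(i), generate(v))
        }
    }
}

/// Wrap a loop body into a kernel where the loop index is the global id.
fn kernel(name: &str, arrays: &[&str], index: &str, body: &Expr) -> String {
    let params: Vec<String> = arrays
        .iter()
        .map(|a| format!("__global double* {}", a))
        .collect();
    format!(
        "__kernel void {}({}) {{\n  int {} = get_global_id(0);\n  {};\n}}\n",
        name,
        params.join(", "),
        index,
        generate(body)
    )
}

// Small constructors to keep tree building readable.
fn var(name: &str) -> Box<Expr> {
    Box::new(Expr::Var(name.to_string()))
}
fn at(array: &str, i: &str) -> Box<Expr> {
    Box::new(Expr::Index(var(array), var(i)))
}

fn main() {
    // c[i] = a[i] + b[i]
    let body = Expr::Update(
        var("c"),
        var("i"),
        Box::new(Expr::Plus(at("a", "i"), at("b", "i"))),
    );
    print!("{}", kernel("f1", &["a", "b", "c"], "i", &body));
}
```

Unlike the Scala version, there is no operator overloading or `for` desugaring here, so the tree is built explicitly; the code-generation step is the same idea.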
Complexity
Loose coupling (parts can be built independently)
Amount of shared infrastructure (duplication)
Amount of location information.
Typeclasses:
typeclasses in Haskell
implicit type transformations in Scala
concepts in C++14x (WS, not ISO)
traits in Rust
[Diagram: A, B and C, where C is a representation of A. A doesn't care about B and C; B doesn't care about A.]
Typeclasses
class GEArrayIndex(x:GE[Array[Double]], i:GE[Int])
{
  def generate():GPUCode =
  {
    val (cx, ci) = (x.generate(),i.generate())
    GPUCode(defs = merge(cx.defs,ci.defs),
            expr = s"${cx.expr}[${ci.expr}]")
  }
}
class GEArrayIndex(val x:GE[Array[Double]], val i:GE[Int])
implicit
object GEArrayIndexCompiler extends Compiler[GEArrayIndex,GPUCode]
{
  def generate(source: GEArrayIndex):GPUCode =
  {
    val (cx, ci) = (source.x.generate(), source.i.generate())
    GPUCode(defs = merge(cx.defs,ci.defs),
            expr = s"${cx.expr}[${ci.expr}]")
  }
}
trait Compiler[Source,Code]
{
  def generate(s:Source):Code
}
[Diagram: A, B, C]
Typeclasses
Scala:
class String
trait Comparable[A]
implicit object StringComparator extends Comparable[String]
RUST:
trait Ordered
{
  fn less(&self, other: &Self) -> bool;
}
impl Ordered for String
{
  fn less(&self, other: &Self) -> bool
  {
    return ….
  }
}
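A runnable version of the trait idea, as a sketch: `Ordered` is implemented for two existing types without touching their definitions, and a generic function depends only on the trait. The comparison bodies and the `min_of` helper are assumptions added for illustration.

```rust
trait Ordered {
    fn less(&self, other: &Self) -> bool;
}

// Implementations live apart from the types themselves (loose coupling).
impl Ordered for String {
    fn less(&self, other: &Self) -> bool {
        self < other
    }
}

impl Ordered for i32 {
    fn less(&self, other: &Self) -> bool {
        self < other
    }
}

/// Works for any type that has an `Ordered` implementation.
fn min_of<T: Ordered>(items: &[T]) -> Option<&T> {
    let mut best = items.first()?;
    for item in &items[1..] {
        if item.less(best) {
            best = item;
        }
    }
    Some(best)
}

fn main() {
    println!("{:?}", min_of(&[3, 1, 2]));
    println!("{:?}", min_of(&["b".to_string(), "a".to_string()]));
}
```

`min_of` knows nothing about `String` or `i32`, and neither type knows about `min_of`; the trait implementation is the only point of contact.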
Language features:
Research => Mainstream
Type analysis
Lightweight threads/async interfaces
Metaprogramming
Loose coupling à la typeclasses
Mostly research
implicit parallelism
distributed computing
gradual typing
language composition
SE 2016.
3 Sep. 2016
Questions.
Ruslan Shevchenko
ruslan@shevchenko.kiev.ua
@rssh1
https://github.com/rssh
See during SE 2016.
3 Sep. 2016
TBD
CREOLE LANGUAGE
Pidgin English (Hawaii Official)
Simplified grammar;
natural learning curve;
Use language without knowing one
