A few questions about the state/architecture of csnative

Aug 11, 2015 at 1:48 PM
Hi,

Thought I'd create a separate thread for some more general questions:
  1. How is the MSIL to CPP conversion done? Is it more mature than the C# to MSIL conversion, or less so?
  2. Is there anything the C# to CPP converter doesn't support?
  3. Which C# language and MSIL versions are supported by the converters (for example, .NET 4.6 introduced some new syntactic sugar, etc.)?
  4. How is GC handled when compiling with emscripten and emcc? With g++ I can see you link in a GC lib, but is something similar done for emscripten?
The reason I am asking all these questions is to determine how likely it is that csnative could be used (or fairly easily expanded) to generate asm.js for some of the core APIs we have where I work (which have no UI, DB, web or interop dependencies, but currently target .NET 4.5).
Coordinator
Aug 11, 2015 at 9:53 PM
Edited Aug 11, 2015 at 9:54 PM
1) It reads IL bytecode from DLLs and generates C code (with exceptions borrowed from C++) that executes it and gets the same result.

For example, take this C# code:
using System;

class X {
    public static int Main (string [] args)
    {
        Console.WriteLine ("Hello, World!");
        return 0;
    }
}
It compiles to MSIL code which simply loads the string "Hello, World!" and calls the method Console.WriteLine. So il2c generates the same logic as C code.
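To make that concrete, here is a rough sketch of what a one-to-one translation of that Main could look like in C. It is not il2c's actual output: the names rt_string, rt_console_write_line and X_Main are hypothetical, the string[] args parameter is left out, and the IL in the comment is roughly what a release build of the C# above produces.

```c
#include <stdio.h>

/* Hypothetical stand-in for a managed string object. */
typedef struct {
    int length;
    const char *chars;
} rt_string;

/* Hypothetical stand-in for System.Console.WriteLine(string).
 * Returns the number of characters written, as printf does. */
int rt_console_write_line(const rt_string *s) {
    return printf("%.*s\n", s->length, s->chars);
}

/* IL of X.Main, translated one instruction at a time:
 *   ldstr "Hello, World!"
 *   call  void System.Console::WriteLine(string)
 *   ldc.i4.0
 *   ret
 */
int X_Main(void) {
    static const rt_string hello = { 13, "Hello, World!" }; /* ldstr */
    rt_console_write_line(&hello);                          /* call */
    return 0;                                               /* ldc.i4.0, ret */
}
```

Calling X_Main() from a C main would print the greeting and return 0, the same as the C# program.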

2) Yes, for now it does not support many things. C# is not just C#; behind it is a big VM called the .NET Framework.

For now it does not support:
  • Variance
  • Struct Layout
  • __arglist, __makeref, __reftype, __refvalue (it supports them partially)
  • DllImport the way C# does it
  • Serialization
  • Reflection
  • Dynamic
  • Creating runtime types etc.
All of it can be implemented, but it was not the aim of this project. The main aim of this project is to use C# instead of C++ to produce native binary code.

3) il2c converts all MSIL opcodes into C code, so it does not depend on any C# version. (For example, it does not matter how much new syntactic sugar you use, as it will be compiled to the same MSIL instruction set that .NET Framework 2.0 already used.)

4) GC is not enabled for emscripten at all, as the JavaScript VM is responsible for allocating objects etc.
Aug 12, 2015 at 6:35 AM
Edited Aug 12, 2015 at 6:36 AM
1) It reads IL bytecode from DLLs and generates C code (with exceptions borrowed from C++) that executes it and gets the same result.
Then why implement both CS -> CPP and MSIL -> CPP? Or does it internally compile to MSIL before generating CPP?
2) Yes, for now it does not support many things. C# is not just C#; behind it is a big VM called the .NET Framework.

For now it does not support:
  • Variance
  • Struct Layout
  • __arglist, __makeref, __reftype, __refvalue (it supports them partially)
  • DllImport the way C# does it
  • Serialization
  • Reflection
  • Dynamic
  • Creating runtime types etc.
The only thing I need here is the struct layout, but I guess that's not too tricky to implement.
3) il2c converts all MSIL opcodes into C code, so it does not depend on any C# version. (For example, it does not matter how much new syntactic sugar you use, as it will be compiled to the same MSIL instruction set that .NET Framework 2.0 already used.)
Okay, so it's preferable to run il2c on MSIL assemblies instead of the C# source code?
4) GC is not enabled for emscripten at all, as the JavaScript VM is responsible for allocating objects etc.
Hmm, I thought asm.js modules ran in a virtual heap (basically a big typed array in JS), meaning that the JS VM doesn't know about objects allocated on the virtual heap.
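A minimal sketch of the concern, assuming the compiled module sees memory as one fixed block: the host only sees a single big array, so nothing inside it is collected unless the module brings its own GC. The names heap_alloc and heap_used and the 1024-byte size are made up for illustration; this is not how emscripten's allocator is actually implemented.

```c
#include <stddef.h>
#include <stdint.h>

#define HEAP_SIZE 1024u  /* hypothetical size of the virtual heap in bytes */

/* Plays the role of the single typed array behind an asm.js module.
 * Declared as 64-bit words so every 8-byte offset is suitably aligned. */
static uint64_t heap_words[HEAP_SIZE / 8];
static size_t heap_top = 0;  /* offset of the first unused byte */

/* Bump allocation rounded up to 8 bytes. Nothing is ever freed, so
 * without a collector inside the module this returns NULL once the
 * block is used up. */
void *heap_alloc(size_t n) {
    size_t rounded = (n + 7u) & ~(size_t)7u;
    if (rounded < n || rounded > HEAP_SIZE - heap_top) {
        return NULL;  /* size overflowed, or the heap is exhausted */
    }
    void *p = (uint8_t *)heap_words + heap_top;
    heap_top += rounded;
    return p;
}

/* Number of bytes handed out so far. */
size_t heap_used(void) {
    return heap_top;
}
```

From the host's point of view heap_words is one live object for the whole run, which is why dead objects inside it are invisible to the JS VM's collector.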

If you look at, for example, the Lua VM in asm.js, you will see that they explicitly state that they also compile the garbage collector:

"Are you really porting the entire Lua VM?
Yes: The entire Lua 5.2.3 codebase written in C is compiled to JavaScript here, including a full incremental GC and everything else. It fits in 170K of gzipped JavaScript."

Is there anything preventing just compiling libgc with emscripten and redirecting memory allocations to that GC engine (which I guess is already done when not specifying the /emscripten option)?
Coordinator
Aug 12, 2015 at 8:42 AM
Edited Aug 12, 2015 at 8:47 AM
il2c works with MSIL only; it includes the Roslyn compiler to compile C# to MSIL.
Is there anything preventing just compiling libgc with emscripten and redirecting memory allocations to that GC engine (which I guess is already done when not specifying the /emscripten option)?
Nothing is preventing you from doing it. I just believe that the hboehm GC lib does not compile for emscripten (I have not tried it myself).
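For what the redirection could look like on the C side, here is a small sketch under assumptions: the USE_BOEHM_GC switch, the RT_INIT/RT_ALLOC macros and the rt_object header are hypothetical names, not taken from csnative. GC_INIT and GC_MALLOC are the Boehm collector's own entry points from gc.h. Without the switch, the code falls back to plain calloc and never reclaims anything.

```c
#include <stddef.h>
#include <stdlib.h>

#ifdef USE_BOEHM_GC                   /* hypothetical build switch */
#include <gc.h>                       /* Boehm-Demers-Weiser collector */
#define RT_INIT()   GC_INIT()
#define RT_ALLOC(n) GC_MALLOC(n)      /* zeroed, collectable memory */
#else
#define RT_INIT()   ((void)0)
#define RT_ALLOC(n) calloc(1, (n))    /* zeroed, never reclaimed here */
#endif

/* Hypothetical object header used by the generated code. */
typedef struct {
    int type_id;
    int length;
} rt_object;

/* Every managed allocation goes through RT_ALLOC, so swapping the
 * collector only means changing the macro definitions above. */
rt_object *rt_new_object(int type_id) {
    rt_object *o = RT_ALLOC(sizeof *o);
    if (o != NULL) {
        o->type_id = type_id;
    }
    return o;
}
```

Whether the collector itself builds under emscripten is exactly the open question here; the sketch only shows that the generated code would not need to change.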
Aug 12, 2015 at 8:49 AM
Ah okay, that makes sense, you only ever deal with MSIL.

You didn't answer my question regarding GC? I just want to verify that this would have to be added to use asm.js (well, that or run out of memory :)

Might I also ask why you stopped doing MSIL -> LLVM and switched to MSIL -> CPP? (You have the LLVM tool marked as deprecated.)
Coordinator
Aug 12, 2015 at 9:45 AM
To stop doing double work. The C I generate is as close to the metal as LLVM IR, which means you can just run clang on it and get your LLVM IR (bitcode) files.
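As an illustration of that route, take a tiny C function standing in for one generated method (the file name add.c and the function add_i4 are hypothetical). The clang invocations in the comment use the standard -emit-llvm flag.

```c
/* add.c: stand-in for one generated method.
 *
 *   clang -S -emit-llvm add.c -o add.ll   (textual LLVM IR)
 *   clang -c -emit-llvm add.c -o add.bc   (LLVM bitcode)
 */
#include <stdint.h>

/* The IL `add` instruction wraps on overflow, so the sum is computed
 * in unsigned arithmetic to avoid signed-overflow undefined behaviour
 * in C. Converting the result back to int32_t is implementation-defined;
 * gcc and clang wrap modulo 2^32. */
int32_t add_i4(int32_t a, int32_t b) {
    return (int32_t)((uint32_t)a + (uint32_t)b);
}
```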
Aug 12, 2015 at 10:23 AM
Regarding GC, do you think it's a big effort to get that working? I was considering trying to see if I could build CoreCLR's GC stand-alone with emscripten. If that is possible, I would hope that it wouldn't be too hard to integrate with csnative.
Coordinator
Aug 12, 2015 at 10:32 AM
Yes, if you manage to extract CoreCLR's GC I will be happy to embed it too.
Aug 15, 2015 at 8:42 PM
I will give it a go once I have some spare time. It seems to be a fairly isolated component. Since they are targeting Linux, I would expect it to compile with gcc, so hopefully it's not a big deal.