IDE binding internals

Postby thriwkin » Wed Jan 08, 2014 6:48 pm

Subject:
How to make a Language Binding (LB) for the Cobra Language
to one of these IDEs:
- SD: SharpDevelop
- MD: MonoDevelop
- XS: Xamarin Studio
- VS: Visual Studio

I want to understand how an LB can be made.
I am not a user of any of these IDEs; I just want to understand how it works.

Hence:
1. Over the last few years I have occasionally studied the source code of these LBs:
- C#, Boo, IronPython in SD3, SD4,
- C# in SD5, MD/XS.

2. Because I could not understand it from reading alone, I implemented such an LB (Cobra for SD4), and gradually got the feeling that I understood it. By then it had all the usual LB features for the most frequent kinds of expressions. After that it became a little dull, just routine work plus certain bugs in the Cobra compiler, so I stopped.

3. But some time ago I looked into the source of SD5.
They changed a lot of things; currently C# is the only working LB,
and none of the other SD4 LBs work anymore.
So maybe understanding SD5 could be a new challenge.

4. And of course, I peeked a little bit into the source code of the LB of nerdzero:
- Cobra in MD/XS

5. The usual LBs are all similar. You have to make
- a parser or an AST conversion visitor
- expression finder
- resolver
Much work!
The LB of nerdzero is quite different:
- it just uses the resolved AST of the Cobra compiler.
Much simpler!
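
To make the "expression finder" item concrete: such a component takes the editor text and the caret offset and returns the expression the user is working on, which the resolver then looks up. Below is a minimal Python sketch, assuming expressions are only dotted names. `find_expression` is a hypothetical helper, not the API of SD or MD, and a real finder also has to cope with calls, indexers, strings and comments.

```python
def find_expression(text: str, caret: int) -> str:
    """Return the dotted name that ends at the caret (may be empty).

    Hypothetical, simplified helper: it scans backwards from the caret
    over letters, digits, underscores and dots.
    """
    start = caret
    while start > 0 and (text[start - 1].isalnum() or text[start - 1] in "._"):
        start -= 1
    return text[start:caret]


line = "x = Lib.Class1.fo"
# Completion was requested at the end of the line.
print(find_expression(line, len(line)))
```

With this in place, the resolver only has to interpret the short string returned by the finder instead of the whole source file.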

Very, very interesting!
I wonder if this approach can lead to an LB with all the features of a C# LB.
Maybe it is a shortcut to the summit, or a dead-end street.
thriwkin
 
Posts: 26

Re: IDE binding internals

Postby thriwkin » Wed Jan 08, 2014 6:53 pm

About an issue of an LB (MonoDevelop.CobraBinding.0.5.2)
Reply to this issue analysis

Issues are undesirable if you want to use the IDE,
but they are very, very useful if you want to understand how the IDE and the LBs interoperate!

Therefore, two issues of this LB -- suitable for a little experiment that shows that
- the usual LBs work differently than this LB:

If a solution has two projects, Main and Lib, where Main has a project reference to Lib,
then:
(I1) Edit-and-build will block the output file of the project (Lib\bin\Debug\Lib.dll)
- nothing can be compiled anymore
- because the Cobra compiler uses `Assembly.loadFrom` to read this assembly.
See also: Test B and analysis.
(I2) If the source in Main references a symbol definition in Lib, for example `Lib.Class1`, then
- the "Goto Declaration" command will not work
- because the assembly in file "Lib.dll" has the symbol definition, but the location of the definition (filename, line, column) is in the program database file ("Lib.pdb" or "Lib.mdb").

Of course this could be changed:
- Read the assembly without file blocking.
- Read the program database file as well (if it is present, which depends on the "-debug" option).
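
The two changes can be sketched in Python as follows. This only illustrates the idea of copying the file into memory and releasing the handle before doing anything else (in .NET the bytes could, for example, be passed to `Assembly.Load(byte[])` instead of calling `Assembly.LoadFrom`). The helper names `read_without_holding`, `find_debug_file` and `demo` are assumptions made for this example, and the debug file names are simply those mentioned above.

```python
import tempfile
from pathlib import Path
from typing import Optional


def read_without_holding(path: Path) -> bytes:
    """Copy the whole file into memory; the handle is closed on return."""
    with open(path, "rb") as handle:
        return handle.read()


def find_debug_file(assembly: Path) -> Optional[Path]:
    """Return a sibling debug file ("Lib.pdb" or "Lib.mdb") if one exists."""
    for suffix in (".pdb", ".mdb"):
        candidate = assembly.with_suffix(suffix)
        if candidate.exists():
            return candidate
    return None


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        dll = Path(tmp) / "Lib.dll"
        dll.write_bytes(b"fake assembly image")
        image = read_without_holding(dll)
        # A later build can overwrite the output: nothing holds it open.
        dll.write_bytes(b"rebuilt assembly image")
        before = find_debug_file(dll)            # no debug file yet
        (Path(tmp) / "Lib.pdb").write_bytes(b"fake symbols")
        after = find_debug_file(dll)             # now found
        return image, dll.read_bytes(), before, after.name
```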

But I think:
- an LB should not read these files at all!
- (S1) The usual LBs in SD, MD, XS never read the output assembly of a project!

If you want to check this sentence (S1) by reading the source code -- that would be like reading a book with 1000 pages.
But you can easily check this with the following test:

Test C
Outline
- Make a solution with two C# projects, linked together with a "Project Reference" from Main to Lib,
- delete all output assemblies!
- edit the source without "Build" and try out the LB features listed below.
If you never "Build", no output assemblies are created.
Hence: if all features work, then S1 is true.

Details
In SD or MD or XS:
  • Create a new solution, named "C", with a "C# Console Project", named "Main"
    (In XS: New Solution > C# > Console Project > Name: "Main", Solution name: "C".)
    In SD4 the language can be C#, VB, Boo, IronPython.
  • Add a "C# Library Project", named "Lib".
    In SD4 the language can be C#, VB, Boo, IronPython -- can be different from the language of Main.
  • in "Main" add a Project Reference to "Lib".
    (In XS: View Solution > Main > References > Edit References... > Projects.)
  • Now ensure that there are really no project output assemblies present:
    Select the "Clean All" command and check the directories "obj" and "bin".
  • The source should be correct. Pay attention to the "namespace" and "usings" in both projects.
    Change the source, do not "Save", do not "Build", such that no assembly file is created.
Now try out these LB features:
- View Classes (in XS: "Refresh" is necessary...)
- View Document Outline (in XS)
- Code Completion triggered by ctrl-space, ".", keyword
- Parameter Insight triggered by "(", "["
Point with the mouse onto an identifier:
- Code Insight (Tooltip)
Open the context menu and select:
- "Go to Declaration"
- "Go to base"
- "Find references"
- "Find derived classes"
- "Find overrides"
- "Refactoring > Rename"
- ...

Result
On my computer:
- all works,
- and you can "Goto Declaration", even if the two projects are of a different language,
- no output assembly is read -- it cannot be, because none is present!


------------------
Can you imagine how this works?
Is there anyone in this forum who wants to understand how this is possible?
We could talk about this here in this thread, with concrete source code links and quotations, ...

But maybe you think that is not "Cobra", ..., that will waste the precious hard disk space of the server.
thriwkin
 
Posts: 26

Re: IDE binding internals

Postby nerdzero » Thu Jan 09, 2014 1:13 am

Your analysis of what is required to make a language binding add-in is correct, as is your analysis of the current state of the MD Cobra add-in.
thriwkin wrote:5. The usual LBs are all similar. You have to make
- a parser or an AST conversion visitor
- expression finder
- resolver
Much work!
The LB of nerdzero is quite different:
- it just uses the resolved AST of the Cobra compiler.
Much simpler!

Very, very interesting!
I wonder if this approach can lead to an LB with all the features of a C# LB.
Maybe it is a shortcut to the summit, or a dead-end street.

With the current approach, we have a shortcut through the clouds, but I've discovered the road doesn't lead to the summit. At least we can see the summit now though :)

To get all the features of the C# binding, which I am using as a reference implementation along with the D language binding, we indeed will need a new parser and resolver. And you are right, it is a lot of work.

It's slow going, but I've been working on a Cobra source code analysis library, tentatively named Venom, which will include a parser and resolver compatible with NRefactory. At this point, it can only create an AST for a Hello World program or other basic program like adding two numbers and then pretty print the AST. :oops: I've recently taken a break from it and am working on some easier features in the add-in like more project options in the GUI for changing things like the target assembly type (i.e. is this a console project or library project?), adding an application icon to the generated exe, etc.

thriwkin wrote:Now try out these LB features:
- View Classes (in XS: "Refresh" is necessary...)
- View Document Outline (in XS)
- Code Completion triggered by ctrl-space, ".", keyword
- Parameter Insight triggered by "(", "["
Point with the mouse onto an identifier:
- Code Insight (Tooltip)
Open the context menu and select:
- "Go to Declaration"
- "Go to base"
- "Find references"
- "Find derived classes"
- "Find overrides"
- "Refactoring > Rename"
- ...

Result
On my computer:
- all works,
- and you can "Goto Declaration", even if the two projects are of a different language,
- no output assembly is read -- it cannot be, because none is present!


------------------
Can you imagine how this works?
Is there anyone in this forum who wants to understand how this is possible?
We could talk about this here in this thread, with concrete source code links and quotations, ...

I would love to talk about how these work! One important key for the parser used inside an IDE is that it must be able to generate a valid AST for code that will not compile. This is different from the AST generated by the parser in the compiler because the compiler only needs to go so far before it tells the programmer: "Hey, you have errors here you need to fix before this will compile". This post on SO from Eric Lippert (with Microsoft at the time but no longer) was very insightful: http://stackoverflow.com/questions/9556 ... tellisense
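
The difference between the two kinds of parser can be illustrated with a toy language in which every line has the form `name = integer`. This is a minimal Python sketch, not how Cobra, NRefactory or Roslyn do it; all names (`Assign`, `ErrorNode`, `parse_strict`, `parse_tolerant`) are made up for the example. The compiler-style parser stops at the first error, while the IDE-style parser records an error node and keeps going, so later definitions still reach the tree.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Assign:
    line: int
    name: str
    value: int


@dataclass
class ErrorNode:
    line: int
    text: str      # the source text that could not be parsed
    message: str


Node = Union[Assign, ErrorNode]


def parse_line(line_no: int, text: str) -> Assign:
    """Parse 'name = integer'; raise SyntaxError on anything else."""
    name, sep, value = text.partition("=")
    name, value = name.strip(), value.strip()
    if not sep or not name.isidentifier():
        raise SyntaxError(f"line {line_no}: expected 'name = integer'")
    try:
        return Assign(line_no, name, int(value))
    except ValueError:
        raise SyntaxError(f"line {line_no}: expected an integer after '='")


def parse_strict(source: str) -> List[Assign]:
    """Compiler style: the first error aborts parsing."""
    return [parse_line(n, t)
            for n, t in enumerate(source.splitlines(), 1) if t.strip()]


def parse_tolerant(source: str) -> List[Node]:
    """IDE style: bad lines become ErrorNode and parsing continues."""
    nodes: List[Node] = []
    for n, t in enumerate(source.splitlines(), 1):
        if not t.strip():
            continue
        try:
            nodes.append(parse_line(n, t))
        except SyntaxError as exc:
            nodes.append(ErrorNode(n, t, str(exc)))
    return nodes
```

For the half-typed source `"x = 1\ny = \nz = 3"`, `parse_strict` raises on line 2, whereas `parse_tolerant` still returns a node for `z`, which is what completion and outline features need.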

It led me to do more research on Roslyn and how it represents tokens and nodes in the AST: http://msdn.microsoft.com/en-us/vstudio/roslyn.aspx There's a lot of good info in those links.

I'm fairly familiar with the C# binding in MD/XS although I'm no expert: https://github.com/mono/monodevelop/tre ... arpBinding

Lately I've been poring over the NRefactory code specific to the C# AST: https://github.com/icsharpcode/NRefacto ... CSharp/Ast

I've also been examining how Alexander Bothe creates the document outline in the D binding: https://github.com/aBothe/Mono-D/blob/m ... tension.cs That dude is pretty impressive.

Most of this research has been in support of better code completion. Implementing other features like refactoring/renaming, go to base class, etc. is possible with the current approach, but a more robust parser would make things a bit easier (but it's hard to make that parser). Also, a separate resolver would greatly improve performance. Right now, I make the Cobra compiler library do a lot of work that gets thrown away on the next key stroke :lol:

So, that's a lot of stuff I just typed but it doesn't really explain how project references work in other bindings without requiring an actual assembly which is what your original question was about. The answer, at least for MonoDevelop/Xamarin Studio, is the NRefactory typesystem. It doesn't matter if it's an assembly reference or a project reference, they are both converted into the same data structures so that types and other information can be resolved the same way. I recommend reading this: http://www.codeproject.com/Articles/408 ... sharp-code paying particular attention to the "Type System" section. Also, peruse this code: https://github.com/icsharpcode/NRefacto ... TypeSystem and then we can discuss more if you like. Also, I don't know much about the SD language bindings. It might be interesting to see how they do it as well. Do you happen to know where their source is hosted?
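
The idea that assembly references and project references end up in the same data structures can be sketched like this. It is a simplified Python model, not the NRefactory API: `TypeDef`, `TypeProvider`, `ProjectContent`, `BinaryReference` and `Compilation` are hypothetical names chosen for the example. The point is that the resolver only sees one interface, and that a type coming from parsed source carries its location even if no assembly was ever built.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Protocol, Tuple


@dataclass(frozen=True)
class TypeDef:
    full_name: str
    # (file, line, column) when the definition came from source, else None
    location: Optional[Tuple[str, int, int]] = None


class TypeProvider(Protocol):
    """What a resolver needs from any reference, whatever its origin."""
    name: str

    def find_type(self, full_name: str) -> Optional[TypeDef]: ...


@dataclass
class ProjectContent:
    """Types collected by parsing a project's source files."""
    name: str
    types: Dict[str, TypeDef] = field(default_factory=dict)

    def add_from_source(self, full_name: str, file: str, line: int, column: int) -> None:
        self.types[full_name] = TypeDef(full_name, (file, line, column))

    def find_type(self, full_name: str) -> Optional[TypeDef]:
        return self.types.get(full_name)


@dataclass
class BinaryReference:
    """Types read from the metadata of a compiled assembly (no locations)."""
    name: str
    type_names: List[str] = field(default_factory=list)

    def find_type(self, full_name: str) -> Optional[TypeDef]:
        return TypeDef(full_name) if full_name in self.type_names else None


@dataclass
class Compilation:
    main: TypeProvider
    references: List[TypeProvider]

    def resolve(self, full_name: str) -> Optional[TypeDef]:
        # Both kinds of reference are queried in exactly the same way.
        for provider in [self.main, *self.references]:
            found = provider.find_type(full_name)
            if found is not None:
                return found
        return None


lib = ProjectContent("Lib")
lib.add_from_source("Lib.Class1", "Lib/Class1.cs", 5, 18)   # made-up location
corlib = BinaryReference("mscorlib", ["System.String"])
compilation = Compilation(ProjectContent("Main"), [lib, corlib])
```

Here `compilation.resolve("Lib.Class1")` yields a definition with a source location, which is the information "Go to Declaration" needs, while `compilation.resolve("System.String")` yields one without.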
nerdzero
 
Posts: 286
Location: Chicago, IL

Re: IDE binding internals

Postby nerdzero » Thu Jan 09, 2014 1:34 am

Actually reading over your post again I see you mentioned that the latest version of SD supports only C# when doing project references. So I imagine the other language bindings didn't make the switch to NRefactory and are now broken.
nerdzero
 
Posts: 286
Location: Chicago, IL

Re: IDE binding internals

Postby thriwkin » Mon Jan 13, 2014 7:40 pm

nerdzero,

thanks for your long reply, and sorry for the long delay of my reply.
nerdzero wrote:I would love to talk about how these work!

The length of our prior messages seems to indicate that
"talk" in an internet forum is not possible for this complicated stuff.
But maybe this is possible:
an exchange of information chopped into message pieces, in chaotic order, but linked via quotes.
thriwkin
 
Posts: 26

Re: IDE binding internals

Postby thriwkin » Mon Jan 13, 2014 7:43 pm

Porting from SD4 to SD5?

The 'symbol table' of SD5/MD/XS is the same -- it is the so-called 'type system' of NR5 (NRefactory 5).
The 'symbol table' of SD4 is quite different -- it is a part of the so-called 'DOM' of SD4.
Nevertheless: the C# binding of SD4 and SD5/MD/XS are essentially the same,
and the trend is: convergence of SD5 and MD/XS, divergence of SD4 and SD5.

My Cobra-SD4 binding works quite well on SD4
(with multiple projects and languages, and code insight/navigation/completion...).

Porting
- from SD2 to SD3: was simple,
- from SD3 to SD4: was more work,
- from SD4 to SD5: would be very, very much work, because of the different 'symbol table',
and would be too early, because of the many bugs I expect when they present a fresh, new "official release".

So I am not eager to do this, and presumably the other SD4 LB writers think similarly.
thriwkin
 
Posts: 26

Re: IDE binding internals

Postby nerdzero » Mon Jan 13, 2014 8:00 pm

thriwkin wrote:The length of our prior messages seems to indicate that
:D Sorry, I get excited. I will try to keep it to bite-size pieces :)

thriwkin wrote:Nevertheless: the C# binding of SD4 and SD5/MD/XS are essentially the same

thriwkin wrote:- from SD4 to SD5: would be very, very much work, because of the different 'symbol table',

Hmm, so what makes the C# bindings in SD4 and the other IDEs essentially the same, yet makes updating so much more work for other languages? What did the C# bindings do differently that made it easier?
nerdzero
 
Posts: 286
Location: Chicago, IL

Re: IDE binding internals

Postby thriwkin » Mon Jan 13, 2014 9:50 pm

The C# binding of SD is developed by the core developer team of the SD project.
They change SD, and this makes it necessary to change the C# binding.
(I hope they code with their own C#-SD binding, but I suspect that they use the latest Visual Studio to code their SharpDevelop stuff.)

If someone has made an add-in for SD-#(n), he has to wait until SD-#(n+1) is mature enough that he can risk porting his add-in.
And if it is a language binding, then the source code of the C# binding, ported by the core developers, is the only available "documentation". (But I presume that most contributors to the SD project can communicate somehow with the core developers, e.g. because they are students of computer science at the same university.)

"Essentially the same": all working LBs of all these IDEs are essentially the same, because these IDEs are essentially the same. And your MD-Cobra LB is essentially different from all these working LBs.
In what respect? Maybe one of the next postings in this chaotic thread will contain an answer to this question, just by the way.
thriwkin
 
Posts: 26

Re: IDE binding internals

Postby thriwkin » Mon Jan 13, 2014 9:55 pm

Parsing: where to?
nerdzero wrote:I'm fairly familiar with the C# binding in MD/XS although I'm no expert:
https://github.com/mono/monodevelop/tree/master/main/src/addins/CSharpBinding
[...]
I recommend reading this:
http://www.codeproject.com/Articles/408663/Using-NRefactory-for-analyzing-Csharp-code
paying particular attention to the "Type System" section.
Also, peruse this code:
https://github.com/icsharpcode/NRefactory/tree/master/ICSharpCode.NRefactory/TypeSystem
and then we can discuss more if you like.


I am quite familiar with all the links you posted, especially those you "recommend" me to "peruse",
and especially with the
- source code of the C# binding of MD/XS.

Did you peruse this text?

This C# binding maps the
- symbol definitions in the C# source text
to the
- symbol table of the IDE.
and
- when a symbol reference has to be resolved, it uses this symbol table of the IDE.

But the MonoDevelop.CobraBinding behaves quite differently:
- the symbol table of the IDE remains empty,
instead, it maps the
- symbol definitions in the Cobra source text
to the
- symbol table of the Cobra compiler, a part of the compiler's AST,
and then
- the whole symbol table is resolved,
and later
- when a symbol reference has to be resolved, it retrieves the definition from this resolved symbol table.
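
The first flow, in which a binding feeds the symbol table of the IDE and references are resolved against it one at a time, might look roughly like this. It is a minimal Python sketch under stated assumptions: `IdeSymbolTable`, `update_file` and `go_to_declaration` are hypothetical names, the file names and positions are made up, and the per-file replacement on every re-parse is an assumption about how such a table is kept current.

```python
from typing import Dict, Optional, Tuple

Location = Tuple[str, int, int]  # (file, line, column)


class IdeSymbolTable:
    """One table shared by all projects; each binding's parser feeds it."""

    def __init__(self) -> None:
        self._by_file: Dict[str, Dict[str, Location]] = {}

    def update_file(self, file: str, definitions: Dict[str, Tuple[int, int]]) -> None:
        """Replace the entries of one file after it was re-parsed."""
        self._by_file[file] = {
            name: (file, line, column)
            for name, (line, column) in definitions.items()
        }

    def go_to_declaration(self, name: str) -> Optional[Location]:
        """Resolve a single reference on demand."""
        for entries in self._by_file.values():
            if name in entries:
                return entries[name]
        return None


table = IdeSymbolTable()
# Fed from editor buffers of two projects; nothing was built or saved.
table.update_file("Lib/Class1.cs", {"Lib.Class1": (3, 14)})
table.update_file("Main/Program.cobra", {"Program": (1, 7)})
```

Because the table is filled from source text rather than from output assemblies, a reference in Main to `Lib.Class1` can be resolved across projects, and even across languages, before anything is compiled.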

Since July 2012, the beginning of your project, I have been wondering about this:
- the source code of this C# binding is denoted as a "Reference implementation...",
- in the source code of the MD-Cobra binding the name "TypeSystem" appears,
- but this hungry TypeSystem is not fed with symbol definitions,
- and then this changeset appeared: INode.goToDefinitionLocation for IDEs
(wonder, wonder, ...)

And because I know that this cannot work with multiple projects,
I posted these tests.
I was hoping to get an answer to this question:

Was this a conscious decision to do it differently from the C# binding?
thriwkin
 
Posts: 26

Re: IDE binding internals

Postby nerdzero » Mon Jan 13, 2014 10:51 pm

thriwkin wrote:I am quite familiar with all the links you posted, especially those you "recommend" me to "peruse",
and especially with the
- source code of the C# binding of MD/XS.

Awesome, you can probably teach me how I can improve the add-in then.

I'm getting hung up a little on your terminology though. When you say "symbol table of the IDE" are you speaking of a symbol table in the abstract sense or can you give me a link or two to the code of the classes/interfaces you are talking about? Or do you mean the implementation of the NRefactory interfaces for the C# binding? You make it sound like there's a much easier way than what I am doing.

thriwkin wrote:Was this a conscious decision to do it differently from the C# binding?

Yes, it was a conscious decision to do it differently but if you are asking if I did it this way for some kind of benefit over the C# binding then the answer is "no". The approach I've been using since I started this project is to stare at Cobra compiler code, stare at MonoDevelop IDE code, try to understand the data structures and how they interact, try to bind them together and see if it works. Rinse and repeat. Of course it's been some time since I started so I have a better understanding of things but I still have a lot to learn. If you can offer some tips or specific bits of code I should be leveraging I would appreciate it.
nerdzero
 
Posts: 286
Location: Chicago, IL
