Data access with ADO.Net Entity Framework

Topics: Developer Forum
Jun 1, 2008 at 1:44 PM
Edited Jun 1, 2008 at 1:53 PM
I've played with various aspects of manipulating data with N2 API and faced some challenges i'd like to share.

Firstly, the task of importing some hundred thousands records into N2 through N2.Persistence went almost smoothly. N2.Tests project gives an excellent clue on how N2 internals work and provides a basis for a custom code for data manipulation.

However, when I attempted mining my data through the very same N2.Persistence API (thus by-passing N2 UI level), gave a lot of headache because of inefficiency of NHibernate auto-generated SQL queries. The biggest plague was that for each item retrieved, N2 generates a round-trip to the database for getting a record from n2AllowedRole table. This has a dramatical impact on mass-selection perfromance and, evidently, suggests some insightful tweaking in N2/NHibernate machinery, for which i wasn't yet mentally prepared. Optimising Sql Server 2005 database with indexes gave almost no relief. I've gave a spin to Ayende Rahien's Linq to NHibernate toolkit, but it only revealed the bottleneck to be somewhere deep in the NHibernate throat.

Meanwhile, I took a different route and leaned toward a technology i've got recently more familiar with: Language Integrated Query (LINQ). The result is, so far, an ADO.Net Entity Framework ("EF", hereafter) model, which is suitable for a low-level typed access to N2 database.

Here are some advantages of using EF as a persistence layer we've got so far with my solution:
  • Fine-grained control over lazy/greedy loading by moving appropriate directives from a data model definition directly to the code (you do it yourself: either explicit .Include call in the Linq query setup in the beginning -OR- make an implicit reference to a desired association right in the query body, before materialising a query result on a client)
  • Linq -- for me as a developer, this is a most compelling "pro" -- we obtain a familiar, statically typed data access, natively supported by CLR, with all related goodies, the biggest one being Intellisense. (Here i must notice, that Linq to Entities is a bit different beast than these two ubiquitous Linq to Objects and Linq to XML in a mean that it poses some restrictions on what's allowed in your queries and what's not. Also, prepare for some challenges if you're using heterogeneous queries. Namely, inability to do a simple (as it seems from the first glance), join on local array of strings, leads to a pretty non-trivial work-around.)
  • EF has a good conceptual correspondence to NHibernate's declarative model definition capabilities, so someone familiar with the latter could quickly accustomize with the former
  • Native design-time support in Visual Studio. With arrival of Visual Web Developer 2008 SP 1, you even get it for free. (Though, beware, EF designer is not free from bugs. The most annoying one appears to be not a glitch in EF itself, but rather that of a Visual Studio designer, so it has no impact on run-time behavior and should be disregarded in design-time.)
Now, enough fluff, here's how you query the sample n2 database via EF:

        using (var _context = new n2cmsModel.n2cmsEntities()) {
            var _pages =
                from _page in _context.n2Item
                    .Include("Details")
                    .Include("Parent")
                where _page.Type == "PageItem"
                select _page;

            ///materialise result on a client:
            // .. = _pages.AsEnumerable();
        }

Finally, you might exclaim: "Show, don't tell!". Well, i've came up a brief complete sample. This is a simple single-page example of how to query N2 database via pure EF, without any references to N2, NHibernate or Castle, assemblies. For it to work you'll need to drop a non-empty n2 database into ~/App_Data. (I've experimented with that from a SimpleWebSite sample, coming from N2 sources.)

Good luck!
Coordinator
Jun 2, 2008 at 9:34 PM
Cool. EF has been on my to check list for a while now. Thank you for putting it in such a familiar context =) I'll willingly admit I've not prioritized performance, especially insert performance, and I've leaned on the build in caching for read performance.
Jun 8, 2008 at 1:43 PM
Edited Jun 8, 2008 at 3:13 PM
Yes, Cristian, EF is cool (though, not as solid yet, as NHibernate is). Among several outcomes of using EF, there is one i'm exploring right now and like to share some bits.

To use ASP.Net Data Service (once known as "Astoria") and magically get your N2 data model exposed through RESTfull API, one will need to do as much as perform the following steps:

1. Drop n2.edmx file (borrow it from the sample i've mentioned previously) into ~/App_Code folder
2. Add a new Wcf Service from a solution template (e.g.: Catalog.svc)
3. Add reference to System.Data.Services assembly.
4. Modify an .svc file to look like the following:

<%@ ServiceHost
        Language="C#"
        Debug="true"
        Service="Catalog"
        Factory="System.Data.Services.DataServiceHostFactory"
        CodeBehind="~/App_Code/Catalog.cs" %>

(the Factory attribute is essential to avoid boring hacking of a web.config)
5. Make sure a code-behind file look like the following:

using System;
using System.Linq;
using System.Runtime.Serialization;
using System.ServiceModel;
using System.ServiceModel.Activation;
using System.ServiceModel.Web;
using System.Data.Services;
using n2cmsModel;

public class Catalog: DataService<n2cmsEntities>
{
    public static void InitializeService(IDataServiceConfiguration config)
    {
        config.SetEntitySetAccessRule("*", EntitySetRights.All);
    }
}

6. Add the following lines to web.config:

    <system.serviceModel>
        <serviceHostingEnvironment aspNetCompatibilityEnabled="true"/>
    </system.serviceModel>

Now you can play with your data directly in a web browser and ask for something like this:

http://localhost:1963/N2CatalogWeb/Catalog.svc/n2Item?$filter=(isof('n2cmsModel.Product'))&$top=10

(Of course, this was my data model, but you've hopefully got a clue, right? ;-)

Good luck!
Coordinator
Jun 8, 2008 at 10:34 PM
Sweet. I checked your example and I must say it looks rather promising with out of the box REST and all.

Do you know how hard it is to build the model programamtically? (to support the way the content items are defined in n2)
Feb 14, 2009 at 9:20 PM
Edited Feb 14, 2009 at 10:59 PM
Cristian, it took me 8 mon. to understand your question about building the model programatically!

It also took a whole 2 weeks of precious programming time to finally realise that the new shiny ADO.Net Data Services thing (aka "Astoria") not gonna play with N2.. Just to save the time of anyone willing to experiment with the stuff, I'm including a simple test bed project, that contains several example data services and a generic model generator for N2. I'll discuss it later on.

In essence, the problem with Astoria vs. N2 is that (as far as i see the picture) is conceptually targeted at relational data sources and the possibility to map them to semi-hierarchical data. N2, in turn, already provides a complete hierarchical model without a trace of it's relational heritage. Say, we want to access a "Verify Your Email" item (this is from N2.Templates sample data). Intuitively, it should be possible to do this with the following Linq-like query:

  RootPage
    .StartPages
    .First()
    .TextPages
    .Single(_page => _page.Title == "Features")
    .Registrations
    .First()
    .Redirects
    .Single(_redir => _redir.Title == "Verify Your Email");

Wouldn't be nice to do this from a client via Ajax call ? (Indeed, a ModelProvider in my sample package is doing almost exclusively this! It generates a hierarchical model of entities each having a number of public IQueryable<SubEntityN> SubEntitiesN { get; } properties, so you can do such queries. But not from the client-side and not with Astoria, unfortunately.)

In turn, Astoria is forcing you to have a model that looks something like this:

class Model
{
  public IQueryable<TextPage> TextPages { get; }
  public IQueryable<Registration> Registrations { get; }
  public IQueryable<Redirect> Redirects { get; }
  //..the same for every other type ever encountered on any level
}

Sure, any of the types above (TextPage, Registration, Redirect) can have a collection properties with items of the former types:

class TextPage
{
  public IEnumerable<Registration> Registrations { get; }
}

But the reality is that you cannot chain such properties, that is: the following query is illegal in Astoria:

TextPages(1)/Registrations(9)/Redirects(17)/Title/$value

Instead, you'll have to do the crap like this:

  1. Issue the following query: /TextPages(1)/Registrations
  2. Get all Registrations and try to find an ID of a needed one
  3. Issue the another query (assuming the previous yielded 9): /Registrations(9)/Redirects
  4. Repeat [2 - 3] for every level.

I think i woldn't like to use such crappy stuff..

What I'm thinking now: is there any widespread standard on querying hierarchical data via web services ? A good candidate i've found so far is Dojo Data Store...

Now, a little bit about what's inside the attached package:

  • BookService.svc -- a very simple in-memory data model exposed through Astoria (luckily, no dependency on EF or other fancy M$ stuff)
  • DataService.svc -- not working attempt to expose a NHibernate.Linq context through Astoria. (Most probably you'll want to exclude it from the project, coz it won't compile without much of a black magic. Again, it does NOT work anyway.)
  • ModelProvider.cs -- honestly, the most promising piece of code (which might be valuable even out of data services context) -- it's a CodeDom-based model generator mentioned above. It traverses a given ContentItem hierarchy and builds a hierarchical set of nested classes, with each such class's children exposed via IQueryable properties.
  • Model.ashx -- a wrapper over ModelProvider that generates a C# source of a current site model straight into the browser
  • Model.cs -- an output of a Model.ashx (you'll need to paste it manually, as a I've found no way to utilise an ASP.Net Build Providers so far)
  • WebDataService.svc -- not working data service over Model.cs, that sets a bold dot in this sad story.

N2DataServices
Coordinator
Feb 16, 2009 at 7:55 PM
Did you manage get those classes over the wire? Anyhow, I've been loocking at the code and it's quite an interesting approach I must say. I'm once again amazed by the tricks you pull.
Feb 20, 2009 at 2:47 PM
No, Astoria cannot crack ContentItem & Ko (it doesn't like polymorphic behavior of Details, Children and other collections; also, it's so silly that it doesn't use TypeDescriptor API to exclude properties in questions neither it is extensible enough to turn in custom providers -- basically everything is 'internal'. So you have to edit source directly, but such practice smells..).

What i've realised is that you don't want to think of a universal data access web service without a close tie to existing .aspx/.ascx/.mvc view infrastructure. What i mean is this: a beast i call "the universal data access web service" consists of two cornerstones: 1) resource (in a common sense) location scheme; 2) data serialization format. The first is already present in N2, isn't it ? The second could most probably be outsourced to WCF gears (think JSON serializer).

The major question for me now is: where is it better to plug in the logic which will route resource requests either to views (as it is now) OR to data dumper: RequestLifeCycleHandler, IRequestDispatcher, IRequestController, something else ?

Next tasks will be to decide: 1) how to enable client to perform a view-or-data selection (Astoria solves this either by setting http headers OR appending a $format argument in the query string); 2) how to control lazy/greedy loading of the graph (my fantasy here goes that far to imagine a sort of data access layer interceptor, that will "listen" what NH is requested to do during "normal" view rendering, but in a last moment, instead of returning the view to a client, it could dump a collected objects).
Coordinator
Feb 21, 2009 at 1:33 PM
The simplest approach is probably something like a WCF service serving each item as a data transfer object.

A thin wrapper on the client would map this back to polimorphic objects (if this is needed), and also insert custom lazy children collections. Is it a "generic" n2 content items over the wire solution?

Another interesting approach would be a more restful stlyle where the service url would map to the content hierarchy and associations would be links.

A simple solution to querying would be re-using the finder api in N2, serializing the query on the client and deserializing it on the server side.
Feb 24, 2009 at 1:21 PM
Edited Feb 24, 2009 at 1:24 PM
At last i've got somethig that works !

This web data access service is of two bits:

  1. A generic http handler that returns JSON by a ContentItem's ID (format is close to that of Astoria, with a "__deferred" object and uri property for lazily loaded entities)
  2. A patch for N2's BaseController that triggers the above handler by $format=json in a "usual" item's url (again, this is Astoria's behavior)

==Some examples==

  • "/?page=2&$format=json" -- get a page #2 (a start page) in JSON
  • "/page1/sub-page2/?$format=json" -- get a sub-page in JSON
  • "/?$format=json" -- trying to get a start page, but for some reason CurrentItem isn't resolved
  • "/?page=2&item=4&$format=json" -- trying to get a non-page item #4, but got it's parent page #2, the work-around is the following..
  • "/?page=4&$format=json" -- get a non-page item #4, despite that a "usual" query as below would fail..
  • "/?page=4" -- receive an invalid cast exception, which is an expected result though

==Technicalities==

The trickiest part -- a JSON serialization -- is based on .. surprise! .. a heavily patched Json.Net library, that turned to be the most adequate solution among all i've tried (the solution that sucks less, to be fair):

  1. WCF's DataContractJsonSerializer is a heavy-weight monster, that requires a lot of boiler-plate code just to start going (meanwhile, moved to a todo list)
  2. ASP.Net MVC's JsonResult implementation is "internal"
  3. ASP.Net Ajax Extensions JavaScriptSerializer is quite a lightweight, but unfortunatelly lacks extensibility, so it immediatelly fails on a ContentItem because of common serializer's plague: cycles detection ("stack overflow")
  4. ADO.Net Data Services (Astoria) is a right thing but, again, is "internal"
  5. AjaxPro's approach is quite interesting overall (i especially like auto-generated proxies for server methods) (after looking at sources and author's blog, something suggests me that M$'s MVC might have some AjaxPro heritage)
  6. Json.Net is a simplest solution, that does fine just what it's name suggests. I patched it's somewhat archaic reflection-based property traversing to support the TypeDescriptor API -- a level of indirection needed to unintrusively control a seriazlization of 3rd party types.
Coordinator
Feb 24, 2009 at 7:45 PM
Looks cool.

I'm trying to think of the security implications of applying the patch. I'm thinking it would be possible for a malicious user to use this to read data that wasn't intended to be read that way. One could easily create a script to download all items including n2.security users.

The BaseController is meant as an extension point. Could it be used to restrict usage to items really meant to be read this way?
Feb 24, 2009 at 9:24 PM
Cristian, without a doubt current proof-of-concept solution should be reworked (that's why i even haven't put it into repository yet!), at least `item.Children[]` must be taken from the `item.GetChildren()` under the hood, and so on and so forth.. What i'm really curious in is what's your opinion about this approach in itself ? That is: instead of creating another unrelated client API, we could just try to mimic the existing server behavior on the client and reuse existing addressing scheme. Under the 'server behavior' i mean a convenient access to a this.CurrentItem on the client after calling some hypothetical `N2.Resources.Register.ShareCurrentItem(this.Page);`. What's your vision of a client-side API for N2 ? Could it be done this way, or is it better to seek for something else ?
Coordinator
Feb 24, 2009 at 10:15 PM
What do you think about hooking this up with the multiple templates concept? Something like:

/path/to/a/page would render the regular page
/path/to/a/page/json would render a json representation

Can you give more details on the ShareCurrentItem method? Would this enable access to a specific instance or all pages of a certain type?
Mar 3, 2009 at 6:42 PM
still got no answers to your questions, sorry :-/

meanwhile, i'd like to pay your attention to smth. off the n2 and even netfx context: what you think about persevere's approach to information management ? (i face a brain-power shortage event to think that big..)
Mar 10, 2009 at 2:15 PM
Edited Mar 10, 2009 at 8:55 PM
So, some reasoning as to how to build the Information Management System on top of N2.

My goal so far is:
«To achieve a unified addressing scheme for both "views" (.aspx/.ascx/.mvc/whatever..) AND a raw "data" (either JSON or AtomPub)»

If we start from a current situation and try to reuse N2's adressing as is, it soon becomes evident than just appending $format=json to a query string is not enough, because:
  • there need to be some semantics to express deferred loading of complex types/navigation properties/collections
  • there need to be an abstraction level above some system properties (e.g.: instead of giving access to .Children property, it makes sense to pipe it through .GetChildren() method)
  • the above property "abstraction" notion could be still expanded via special providers, which could transform item's properties interpretation for the purpose of our data service, or even create a new "virtual" properties (i'll rave on this a bit more below).
  • possibly, we'd like to address a ContentItem's methods too to make a programmer's server experience "portable" to the client

A good source of inspiration is Astoria spec: [ http://msdn.microsoft.com/en-us/library/cc668809.aspx ], where the following line is remarkable:
ADO.NET_DataService_URI = "http://" serviceRoot [ resourcePath ["?" queryOptions ] ]
I think it's possible to reuse Astoria's queryOptions thing as it is, with aLL it's $expand, $filter, $orderby, $top, $skip goodies.

Though, more thinking should be given to the resourcePath thing. If we base it on the current N2's addressing, than it still needs to be expanded to accomodate all the requirements above. That is, at present N2's URL is pointing to a concrete, single page or item, right? How could we expand it to be able to address say, a .Children property of this page or item, or, say, a certain method, defined in our ContentItem and returning a collection, so it would be subject for Astoria-like $filter-ing and $expan(d)-sion ?

My current idea of expanding N2s' adressing is one somehow inspired by Google Base API. To address properties/methods within a page OR item, i propose to append the /-/<property-name> construct to a URL:

StartPage/myPage/-/Children.aspx

Needless to say, that i became a great proponent of not touching anything in the N2 core. Luckily, this is quite possible now with a recent modifications to allow [ControlsAttribute] for assemblies. So all this URL processing (or rather pre-processing) can occur in the external AspectController. The scneario will be this:
  • if controller finds that a data is requested, than process /-/<remainder> and query arguments (these, starting with $) itself
  • if a usual view is demanded, than strip all "foreign" stuff and delegate path rewriting to a native N2 controller.
What you think, has the above any sense ? Of course, it might work for me, but i'd like to hear some criticism. Sure, i'll continue my experiments, so that something more sensible and less abstract will be available to base an opinion upon.

Now, a little bit about "abstracting ContentItem properties with a providers" as promised earlier.. The idea is to introduce a notion of "axis" or "dimension". Nothing is new, really -- this is already present implicitly in N2: we might well consider a given item hierarchy to be a 2-dimensional space with 2 axis: "parent-children" and "previons-next siblings". Some other, less obvious, that come to mind are: "earlier-later version", "item access permission by user/role". Another example from my N2.Workflow experience could be a "workflow state change history for item" dimension. Some example URIs:

/myItem/-/$workflow.aspx?$format=json&$filter=Published gt 20090310
/myItem/-/$revisions/HEAD.aspx?$format=json

Here, both $workflow and $revisions could be a virtual properties, processed by a provider.

upd: yet another example of an "axis"/"dimension" candidate is "personalization" -- this may share an ASP.Net web parts approach, so that ContentItems properties decorated with a [PersonalizableAttribute] could be persisted on a per user basis.
Coordinator
Mar 10, 2009 at 9:36 PM
You know what? This could be useful when addressing versions of items. So far I've been using id:s for that, which is an itch I've been wanting to scratch since a while.

Other than that the idea looks very restful and would open up for possiblities beyond my imagination. An interesting concept. Is there a relationship to the explorer app?
Mar 11, 2009 at 5:06 PM
oh, well, i haven't thought about N2.Explorer specifically (frankly, still cannot find any use for that spontaneous mind outspring ;-).. Probably, it's more related to another itch i'm trying to pursue for a long time already: it's a concept of "widgets" -- that is, some autonomous pieces of javascript/markup which could function on it's own in a context of another site (well, lemme tell you that i'm under strong influence of  netvibes for a long time, so you probably can imagine what i'd like to achieve). It would be great if these autonomous pieces could be backed by a data, provided by N2. It would be even better if the API to access that data would be similar to what we have now on the server. That's why i'm musing about "unified URLs" for so long.. For now, i'd be quite satisfied to achive just a read-only data access. But, of course, having a full-blown RESTful data service could enable even more interesting scenarios, including desktop clients.

okay, so today i dug in the code and instantly stumbled upon the question asked here many times: "how do i do custom URL rewriting?".. Namely, i need to preprocess an URL in a form
<path-to-item>/-/<resource-path>.aspx?queryOptions
to something N2 would understand :
<path-to-item>.aspx?queryOptions
It cannot be achieved in an RequestAspectController alone, can it ? I guess this pre-processing should be done before url => PathData resolution, but the processing result must be available in the request controller. Some ideas to realise it that come to mind:

1) Implement a custom IUrlParser and tweak it's ResolvePath(..) to pre-process url and somehow return control to existing infrastructure (not an easy thing: all these caching decorator, multi-site parser, etc..)
2) Override a PathData.RewriteItemUrl and handle UrlParser.NotFound to retry path resolution after pre-processing.

Event if any of the two methods would work, than the question arise: how to store preprocessing data (at least that <resource-path> string) so that it could be available later inside a custom request aspect controller ?

Well, rather much to think about ..
Coordinator
Mar 11, 2009 at 10:31 PM
Could the IPathFinder interface (as implemented by the template attribute) be a solution to this? E.g. using [Template("-", "~/path/to.aspx")] would result in a pathdata with the remaining url as an argument that could be interpreted somehow.
Mar 12, 2009 at 11:56 PM
i'm afraid it wouldn't be possible to use [Template] for already existing site, or .. ?
Coordinator
Mar 13, 2009 at 9:28 PM
Good point. This could be changed to work in similarly to the Controls attribute, i.e. defined at assembly level.
Mar 13, 2009 at 10:42 PM
A-ha! Then we're back to the question posed earlier: "how to apply a custom template to existing item" (btw, MagDev's advice about Adapters worked perfectly, especially for dynamically created items within zones..). Okay, so if  you say it's correct to use [Template] for this, then i owe you a patch ;-)
Apr 13, 2009 at 12:41 PM
An intriguing reading on CMS interoperability / content syndication / web services from Drupal circles. Should N2 folks take care of CMIS too ?