Saturday, September 1, 2018

RepoDb: .Net Lightweight ORM Library with Extreme Performance

What is RepoDb?

A dynamic, lightweight, and fast repository-based .Net ORM Library.

Package: https://www.nuget.org/packages/RepoDb
Project: https://github.com/RepoDb/RepoDb
Documentation: https://repodb.readthedocs.io/en/latest/

RepoDb v1.5.2 Result:

Individual Fetches:













Set Fetches:










RepoDb Performance

In July 2018, I posted an initial thread about RepoDb on Reddit, claiming that our library was the fastest one. The thread can be found here (https://www.reddit.com/r/csharp/comments/8y5pm3/repodb_a_very_fast_lightweight_orm_and_has_the/).

Many redditors commented and discussed the library with us, especially its performance, stability, purpose, features, syntax, and differentiators.

We know that our IL is very fast, and it is true that RepoDb was the fastest when mapping large result sets (around 1 million rows). However, the community suggested testing the library with the existing benchmark tools that are commonly used to measure the performance of ORM libraries.

With this, we used FransBouma's benchmarker tool to see how our library performs compared to the others.

Initial Performance Test Result with FransBouma's Tool










It was embarrassing for us to have claimed that RepoDb was the fastest .Net ORM library. It was personally my fault for not running proper performance benchmarks before announcing it to everyone.

RepoDb was slow in the result above because the benchmark tool compares the ORMs iteratively, through repeated calls. During our development, we never considered this approach.

The version of RepoDb at that time was v1.2.0.

Improving the Performance of IL

First, we analyzed the cause of the performance flaws: whether it was the IL or the reflection procedure we had. Early on, I saw that I was not caching the IL statically; I was only caching it on a per-call basis.

With this, we first cached the IL statically by adding this logic.

First logic:










After logic:












Code Level:

Created a new class named DelegateCache.

public static class DelegateCache
{
    ...
}

In our DataReaderConverter class, we used the newly created class above to get the corresponding delegate for our data reader (a pre-compiled, IL-written delegate).

The approach above significantly improved the performance of RepoDb. However, we still had flaws when it came to memory usage. We were aware that we relied heavily on C# reflection.
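As a rough illustration of caching a compiled delegate statically, here is a minimal sketch. It is not RepoDb's actual DelegateCache: the names DelegateCacheSketch and PersonSketch are hypothetical, and the delegate is built with expression trees instead of hand-written IL to keep the example short. It assumes every writable property has a same-named, non-null column.

```csharp
using System;
using System.Collections.Concurrent;
using System.Data;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity used only for this illustration.
public class PersonSketch
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical sketch; not RepoDb's real DelegateCache.
public static class DelegateCacheSketch
{
    // One compiled delegate per entity type, kept for the lifetime of the process.
    private static readonly ConcurrentDictionary<Type, Delegate> m_cache =
        new ConcurrentDictionary<Type, Delegate>();

    public static Func<IDataRecord, TEntity> Get<TEntity>()
        where TEntity : class, new()
    {
        // The expensive compilation runs only when the type is not cached yet.
        return (Func<IDataRecord, TEntity>)m_cache.GetOrAdd(
            typeof(TEntity), type => Compile<TEntity>());
    }

    private static Func<IDataRecord, TEntity> Compile<TEntity>()
        where TEntity : class, new()
    {
        // Builds: record => new TEntity { Prop = (PropType)record["Prop"], ... }
        var record = Expression.Parameter(typeof(IDataRecord), "record");
        var indexer = typeof(IDataRecord).GetProperty("Item", new[] { typeof(string) });
        var bindings = typeof(TEntity)
            .GetProperties()
            .Where(p => p.CanWrite)
            .Select(p => (MemberBinding)Expression.Bind(
                p,
                Expression.Convert(
                    Expression.Property(record, indexer, Expression.Constant(p.Name)),
                    p.PropertyType)));
        var body = Expression.MemberInit(Expression.New(typeof(TEntity)), bindings);
        return Expression.Lambda<Func<IDataRecord, TEntity>>(body, record).Compile();
    }
}
```

A caller would fetch the delegate once per read loop, for example var map = DelegateCacheSketch.Get<PersonSketch>(), and invoke map(reader) for each row. The sketch does not handle DBNull values or type conversions, which a real mapper must.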

Improving the Performance of Reflection

Secondly, we targeted caching the reflected objects. We aimed to make sure that we call typeof(Entity).GetProperties() only once per class throughout the lifetime of the library.

What we did was introduce a class named PropertyCache to cache the call per class. We also added a class named ClassExpression to pre-compile the GetProperties() operation via a lambda expression, so the next time we call it, it is already compiled.

    public static class PropertyCache
    {
        public static IEnumerable<ClassProperty> Get<TEntity>(Command command = Command.None)
            where TEntity : class
        {
            ...
        }
    }
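
The idea of reflecting each class only once can be sketched as follows. This is an assumption-laden illustration, not the library's PropertyCache: the name PropertyCacheSketch is hypothetical, it caches plain PropertyInfo objects instead of ClassProperty, and it ignores the Command parameter.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical sketch; not RepoDb's real PropertyCache.
public static class PropertyCacheSketch
{
    // Reflected properties, keyed by entity type.
    private static readonly ConcurrentDictionary<Type, IReadOnlyList<PropertyInfo>> m_cache =
        new ConcurrentDictionary<Type, IReadOnlyList<PropertyInfo>>();

    public static IReadOnlyList<PropertyInfo> Get<TEntity>()
        where TEntity : class
    {
        // GetProperties() runs only when the type is missing from the cache.
        // Under a race the factory may run more than once, but only one
        // result is stored and returned to every caller.
        return m_cache.GetOrAdd(typeof(TEntity), type => type.GetProperties());
    }
}
```

Every later call for the same type returns the same cached list, so the reflection cost is paid once per class.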

We also created a class named ClassProperty that contains a PropertyInfo object and the necessary properties and methods to cache the definition of the property. We implemented IEquatable<ClassProperty> so that collections can compare these objects efficiently.

    public class ClassProperty : IEquatable<ClassProperty>
    {
        ...
    }

Here is our simple way of caching the definition at the instance level.













Notice the check of the m_isPrimaryAttributeWasSet variable. If it is already set to true, the method has been called before, even if the resulting m_primaryAttribute value is null.

We did the same for the other definition methods throughout the class. The actual class can be found here (https://github.com/RepoDb/RepoDb/blob/master/RepoDb/RepoDb/ClassProperty.cs).

Since the call to GetProperties() happens only once per class after we defined the PropertyCache class, memory usage is minimized here, as we have removed the repeated reflection calls.

We were all set with the implementation above; however, it was still not enough until we cached the actual activity of the caller (the project that references RepoDb). With this, we came up with the idea of caching the command text.
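
A minimal sketch of that flag-based caching, under assumptions: the field names follow the paragraph above, while ClassPropertySketch, the PrimaryAttribute type declared here, and the GetPrimaryAttribute method are hypothetical stand-ins rather than the library's actual code.

```csharp
using System;
using System.Reflection;

// Hypothetical marker attribute declared only for this illustration.
[AttributeUsage(AttributeTargets.Property)]
public sealed class PrimaryAttribute : Attribute
{
}

// Hypothetical sketch; see the linked ClassProperty.cs for the real class.
public class ClassPropertySketch
{
    private readonly PropertyInfo m_property;
    private bool m_isPrimaryAttributeWasSet;
    private PrimaryAttribute m_primaryAttribute;

    public ClassPropertySketch(PropertyInfo property)
    {
        m_property = property;
    }

    public PrimaryAttribute GetPrimaryAttribute()
    {
        // The flag, not the value, says whether the lookup already ran,
        // so a null result is cached as well.
        if (m_isPrimaryAttributeWasSet)
        {
            return m_primaryAttribute;
        }
        m_primaryAttribute = m_property.GetCustomAttribute<PrimaryAttribute>();
        m_isPrimaryAttributeWasSet = true;
        return m_primaryAttribute;
    }
}
```

The separate flag matters because most properties carry no primary attribute. Checking only for a non-null value would repeat the reflection lookup on every call for those properties. The sketch is not synchronized; concurrent first calls could at worst repeat the lookup.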

Caching the CommandTexts

We knew that caching the outside calls would greatly improve the performance of the library, as it would bypass all the operations mentioned earlier in this blog.

With this, we first implemented the request classes you see below.

  • QueryRequest for Query
  • InsertRequest for Insert
  • DeleteRequest for Delete
  • UpdateRequest for Update
  • etc

Each class above accepts all the parameters passed by the outside call. This ensures that the passed values serve as the key that uniquely identifies the command text we are going to cache.

Identifying the Differences of the Parameter Values

We overrode the GetHashCode() and Equals() methods and implemented the IEquatable<T> interface to customize the equality comparison of the following classes.

  • All Request Classes
  • ClassProperty
  • QueryField
  • Field
  • Parameter
  • QueryGroup

It enables us to identify and define the correct equality of the objects (internally to RepoDb only).

Inside the library, we forced the equality: for example, a FieldA instance whose name is "Name" is equal to a FieldB instance whose name is "Name", and so forth. The logic is very simple, as the code below shows.

    public override int GetHashCode()
    {
        return Name.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        return GetHashCode() == obj?.GetHashCode();
    }

    public bool Equals(Field other)
    {
        return GetHashCode() == other?.GetHashCode();
    }

    public static bool operator ==(Field objA, Field objB)
    {
        if (ReferenceEquals(null, objA))
        {
            return ReferenceEquals(null, objB);
        }
        return objA?.GetHashCode() == objB?.GetHashCode();
    }

    public static bool operator !=(Field objA, Field objB)
    {
        return (objA == objB) == false;
    }

The actual class can be found here (https://github.com/RepoDb/RepoDb/blob/master/RepoDb/RepoDb/Field.cs).

Caching Process for CommandText

Lastly, we introduced a class named CommandTextCache that holds the cached command text of the caller. See below the implementation of one of its methods.

internal static class CommandTextCache
{
    private static readonly ConcurrentDictionary<BaseRequest, string> m_cache = new ConcurrentDictionary<BaseRequest, string>();

    public static string GetBatchQueryText<TEntity>(BatchQueryRequest request)
            where TEntity : class
    {
        var commandText = (string)null;
        if (m_cache.TryGetValue(request, out commandText) == false)
        {
            commandText = <codes to get the BatchQuery command text>;
            m_cache.TryAdd(request, commandText);
        }
        return commandText;
    }
}


Let us say somebody calls the repository's Query method as below.

using (var repository = new DbRepository<SqlConnection>(connectionString))
{
    repository.Query<Person>(new { Id = 10220 });
}

The supposed command text is below.

SELECT [Id], [Name], [Address], [DateOfBirth], [DateInsertedUtc], [LastUpdatedUtc] FROM [dbo].[Person];

Inside RepoDb, the repository's Query method creates a new QueryRequest object with the parameters defined by the caller, in this case (new { Id = 10220 }).

Then we simply call CommandTextCache.GetQueryText(queryRequest) to get the cached command text.
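
The flow from request object to cached command text can be sketched like this. Everything here is hypothetical (QueryRequestSketch, CommandTextCacheSketch, and the build callback are illustration names, not RepoDb types). Unlike the hash-only comparison shown for Field, this sketch also compares the values, so two different requests with colliding hash codes cannot share a cache entry.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical request key: the entity type plus the names of the filter fields.
public sealed class QueryRequestSketch : IEquatable<QueryRequestSketch>
{
    public QueryRequestSketch(Type entityType, IEnumerable<string> whereFields)
    {
        EntityType = entityType;
        WhereFields = whereFields.ToArray();
    }

    public Type EntityType { get; }

    public IReadOnlyList<string> WhereFields { get; }

    public bool Equals(QueryRequestSketch other)
    {
        return other != null
            && EntityType == other.EntityType
            && WhereFields.SequenceEqual(other.WhereFields);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as QueryRequestSketch);
    }

    public override int GetHashCode()
    {
        // Combine the type and field-name hashes; overflow is intentional.
        var hash = EntityType.GetHashCode();
        foreach (var field in WhereFields)
        {
            hash = unchecked((hash * 31) + field.GetHashCode());
        }
        return hash;
    }
}

// Hypothetical cache; the real CommandTextCache builds the SQL text itself.
public static class CommandTextCacheSketch
{
    private static readonly ConcurrentDictionary<QueryRequestSketch, string> m_cache =
        new ConcurrentDictionary<QueryRequestSketch, string>();

    public static string GetQueryText(
        QueryRequestSketch request,
        Func<QueryRequestSketch, string> build)
    {
        // The build callback runs only when no equal request is cached yet.
        return m_cache.GetOrAdd(request, build);
    }
}
```

Two calls that pass new { Id = ... } with different Id values would produce equal request keys in this sketch, because only the field names, not the values, take part in the key. The second call therefore skips the command text generation entirely.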

RepoDb Final Results

There are 2 ways of calling the operations in RepoDb: with a persistent connection and with a non-persistent connection. There are also 2 ways of querying: object-based and raw-SQL-based.

The result below is only for the RawSql approach, as we have never injected the Object-Based approach. (Note: RawSql is always faster than Object-Based.) This result was personally executed by FransBouma on their test environment (with Release binaries).

Individual fetches:











Set fetches:



The version of RepoDb at this time is v1.5.3.

Kindly share your thoughts, comments, and inputs, and do not forget to tag me if you would like an immediate response. Thank you for reading this blog!

Friday, June 8, 2018

Apology for 5 years absence!

I would like to apologize for being absent for the last 5 years. Personal problems kept me from focusing on my technical blogging.

I will do my best to become more active and to participate well in the .Net communities.

By the way, after 5 years, we have become stronger when it comes to Microsoft programming. We will focus mostly on the usual business and industry problems, as well as on cloud and big data computing.

This blog will soon be active again. Stay tuned!

Sunday, February 24, 2013

Enabling SQL Server Service Broker

Below is the common script we use to enable the Service Broker in SQL Server 2008.

ALTER DATABASE <DatabaseName> SET ENABLE_BROKER;

Example:


ALTER DATABASE Northwind SET ENABLE_BROKER;

If your database has outstanding open connections, you have to clear them first before running the script. To terminate all connections, set the database to single-user mode with ROLLBACK IMMEDIATE, which rolls back open transactions and disconnects the other sessions. After executing the script, set the database back to multi-user mode.

Example script:

ALTER DATABASE Northwind SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO
ALTER DATABASE Northwind SET ENABLE_BROKER;
GO
ALTER DATABASE Northwind SET MULTI_USER;
GO

Microsoft Documentation: Please visit this link for more information about SQL Server Service Broker.

Sunday, February 10, 2013

Querying Database Objects in SQL Server

SQL Server contains a built-in schema called [sys] which can be used to query all the objects created in the current database. This schema exposes the metadata of each database object, such as indexes, columns/fields, tables, and views.

Developers commonly use it to build auto-generator tools for their architecture.

For Microsoft documentation, please visit this link.

See below how to query the SQL Server objects.

Querying Objects

If your query editor has IntelliSense, you can see all the objects defined in the [sys] schema by typing a dot after the [sys] keyword. See the screenshot below.



You can select which schema object to query. A plain SELECT statement is enough.

For example, we can query the tables, views, stored procedures, and functions using the sys.objects catalog view.

Querying Tables

We can use the sys.objects catalog view to query the list of tables in the database. See the sample code below.

SELECT object_id
      , name
      , type
      , type_desc
      , create_date
      , modify_date
FROM sys.objects
WHERE type = 'U'
ORDER BY name;

Type 'U' identifies the object as a USER_TABLE. If we want to query the custom stored procedures, we can filter by type 'P'.

Querying Fields

We can use the sys.columns catalog view to query all the columns in the database. After querying the columns, we can use the object_id field to determine which table each column belongs to. See the sample below.

SELECT object_id
      , name
      , column_id
      , max_length
      , user_type_id
      , system_type_id
FROM sys.columns;

Joining the tables to query only the columns of custom tables.

SELECT o.object_id
      , o.name as tablename
      , c.name as columnname
      , c.column_id
      , c.max_length
      , c.user_type_id
      , c.system_type_id
FROM sys.columns c
INNER JOIN sys.objects o ON o.object_id = c.object_id
WHERE o.type = 'U'
ORDER BY o.name, c.name;

Querying Indexes


As with the columns, we can use sys.indexes to query the list of indexes of each table. See our sample below.

SELECT i.object_id
      , o.name as tablename
      , i.name
      , i.index_id
      , i.type
      , i.type_desc
      , i.is_unique
      , i.is_primary_key
FROM sys.indexes i
INNER JOIN sys.objects o ON o.object_id = i.object_id
WHERE o.type = 'U'
ORDER BY o.name;

The indexed columns of each table can be queried using sys.index_columns. See below.

SELECT i.object_id
      , o.name as tablename
      , c.name as columnname
      , i.index_id
      , i.index_column_id
      , i.column_id
FROM sys.index_columns i
INNER JOIN sys.objects o ON o.object_id = i.object_id
INNER JOIN sys.columns c ON c.object_id = o.object_id AND c.column_id = i.column_id
WHERE o.type = 'U'
ORDER BY o.name;

Based on your requirements, you can expand and filter for more specific objects inside the [sys] schema.

Saturday, February 2, 2013

Execute SQL Server Scripts in C#

In this tutorial we will show you how to execute SQL Server scripts from C#.Net. This topic is mostly about SMO, or SQL Server Management Objects.

For your reference regarding SMO, please visit Microsoft documentation.

What is SQL Server Management Object?

SQL Server Management Objects is a set of APIs developed by Microsoft so that object manipulation in SQL Server can also be done from the client. This allows developers to build a more dynamic query or class generator architecture.

Let us start with the procedure below.

First, in your C# project, add a reference to the DLLs listed below.
  • Microsoft.SqlServer.Management.Sdk.Sfc
  • Microsoft.SqlServer.Smo
  • Microsoft.SqlServer.SmoExtended
  • Microsoft.SqlServer.SqlEnum
Folder Location: C:\Program Files\Microsoft SQL Server\100\SDK\Assemblies\

We will most likely use only 2 of these binaries (add the others for your future development).

Before running your SQL scripts from the client, test them first in SQL Server Management Studio to confirm that there are no syntax errors. Once the script is correct, you are ready to go with the client manipulation.

Stored Procedure

Suppose you have a database named Northwind with a table named User (userid, name, email, createddate). We will create a sample stored procedure for that table.

For our sample table User, we will create a script that gets a user based on the UserID parameter. See our sample script below.

GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[GetUser_sp]
(
      @UserID BIGINT
)
AS
BEGIN
      SELECT [UserID]
      , [Name]
      , [Email]
      , [CreatedDate]
      FROM [dbo].[User]
      WHERE ([UserID] = @UserID);
END

Ensure that the script above runs successfully in SQL Server Management Studio. If you find any problem (a script or syntax error), fix it before executing the script from the client.

.NET SMO Execution

We will now show you how to execute it from the client. Go back to the C# project and write some code.

First, import the namespaces Microsoft.SqlServer.Management.Smo and Microsoft.SqlServer.Management.Common in your class. See below.

using Microsoft.SqlServer.Management.Smo;
using Microsoft.SqlServer.Management.Common;

Then, create a SqlConnection object that connects to your database. See below.

using (var connection = new SqlConnection(this.ConnectionString))
{
   
}

Inside the using block, create a Server and a Database object. See our code below.

using (var connection = new SqlConnection(this.ConnectionString))
{
    var server = new Server(new ServerConnection(connection));
    var database = server.Databases[connection.Database];
}

On the ConnectionContext property of the Server object, set the AutoDisconnectMode property to NoAutoDisconnect so that it does not disconnect while there are existing pooled operations on the database.

After that, call the Connect method to connect to the server and then call the ExecuteNonQuery method, passing the string of our SQL script. Please make sure to disconnect once the script has executed.

Now, our new code is below.

using (var connection = new SqlConnection(this.ConnectionString))
{
    var server = new Server(new ServerConnection(connection));
    var database = server.Databases[connection.Database];
    server.ConnectionContext.AutoDisconnectMode = AutoDisconnectMode.NoAutoDisconnect;
    server.ConnectionContext.Connect();
    server.ConnectionContext.ExecuteNonQuery("SQL SCRIPTS HERE");
    server.ConnectionContext.Disconnect();
}

Note: If you are running .NET 4.0 and encounter an exception regarding version compatibility, you need to enable support for the .NET 2.0 runtime during startup. To do this, modify the settings in your config file. See below.

<?xml version="1.0"?>
<configuration>
  <startup useLegacyV2RuntimeActivationPolicy="true">
    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.0"/>
  </startup>
</configuration>

The useLegacyV2RuntimeActivationPolicy attribute does the trick.

That's all. Happy coding!