Friday, June 6, 2008

Six C# features Java developers will kill for...

Using GetHashCode and Equals

The ability to override GetHashCode is available on every object but is seldom required for POCOs. The Equals method usually provides all of the comparison functionality we'll ever need. But when using an ORM such as NHibernate, GetHashCode takes a more prominent role as it helps NHibernate determine if an object already exists in a collection. Not overriding GetHashCode, or doing so inappropriately, may lead to duplicate objects showing up in collections and/or objects missing altogether. When needed, most people implement both methods and end up with similar code in both. So to ease the burden of managing both of these methods, there's exploitable overlap between Equals and GetHashCode to kill two birds with one stone.


In my project work, I consider the following to be true when comparing two objects:

* If two objects have IDs, and have the same IDs, then they may be considered equal without examining them further. (I'm assuming ID to be an identity field, or equivalent, in the DB.)
* If two objects have IDs, but have different IDs, than they may be considered not equal without examining them further. E.g., If customer A has an ID of 4 while customer B has an ID of 5, then they are not equal, QED.
* If neither object has an ID or only one of them has an ID, but their "business value signatures" are identical, then they're equal. E.g., customer A has an ID of 4 and a social-security-number of 123-45-6789 while customer B has no ID but also has a social-security-number of 123-45-6789. In this case, customer A and customer B are equal. By "business signatures," I imply a combination of those properties which deem an entity unique, regardless of its ID.

* If one of them is null, then they are not equal. (Whoo, that was easy.)

With the above considerations in mind, we'll want to write code to make the following unit test pass. Note that Customer takes a company name into its constructor. It also has a settable contact name. The combination of its company name and contact name give Customer its unique business signature.

view plaincopy to clipboardprint?

1. [Test]
2. public void CanCompareCustomers() {
3. Customer customerA = new Customer("Acme");
4. Customer customerB = new Customer("Anvil");
5.
6. Assert.AreNotEqual(customerA, null);
7. Assert.AreNotEqual(customerA, customerB);
8.
9. customerA.SetIdTo(1);
10. customerB.SetIdTo(1);
11.
12. // Even though the signatures are different,
13. // the persistent IDs were the same. Call me
14. // crazy, but I put that much trust into IDs.
15. Assert.AreEqual(customerA, customerB);
16.
17. Customer customerC = new Customer("Acme");
18.
19. // Since customerA has an ID but customerC
20. // doesn't, their signatures will be compared
21. Assert.AreEqual(customerA, customerC);
22.
23. customerC.ContactName = "Coyote";
24.
25. // Signatures are now different
26. Assert.AreNotEqual(customerA, customerC);
27.
28. // customerA.Equals(customerB) because they
29. // have the same ID.
30. // customerA.Equals(customerC) because they
31. // have the same signature.
32. // customerB.DoesNotEquals(customerC) because
33. // we can't compare their IDs, since
34. // customerC is transient, and their
35. // signatures are different.
36. Assert.AreNotEqual(customerB, customerC);
37. }

[Test] public void CanCompareCustomers() { Customer customerA = new Customer("Acme"); Customer customerB = new Customer("Anvil"); Assert.AreNotEqual(customerA, null); Assert.AreNotEqual(customerA, customerB); customerA.SetIdTo(1); customerB.SetIdTo(1); // Even though the signatures are different, // the persistent IDs were the same. Call me // crazy, but I put that much trust into IDs. Assert.AreEqual(customerA, customerB); Customer customerC = new Customer("Acme"); // Since customerA has an ID but customerC // doesn't, their signatures will be compared Assert.AreEqual(customerA, customerC); customerC.ContactName = "Coyote"; // Signatures are now different Assert.AreNotEqual(customerA, customerC); // customerA.Equals(customerB) because they // have the same ID. // customerA.Equals(customerC) because they // have the same signature. // customerB.DoesNotEquals(customerC) because // we can't compare their IDs, since // customerC is transient, and their // signatures are different. Assert.AreNotEqual(customerB, customerC); }

Although some argue against a single object which all other persistable domain objects inherit from, I use one nonetheless and ingeniously call it "DomainObject." (Those are Dr. Evil quotes there.) "DomainObject," in its entirety, contains the following:

view plaincopy to clipboardprint?

1. public abstract class DomainObject
2. {
3. ///
4. /// ID may be of type string, int,
5. /// custom type, etc.
6. ///

7. public IdT ID {
8. get { return id; }
9. }
10.
11. public override sealed bool Equals(object obj) {
12. DomainObject compareTo =
13. obj as DomainObject;
14.
15. return (compareTo != null) &&
16. (HasSameNonDefaultIdAs(compareTo) ||
17. // Since the IDs aren't the same, either
18. // of them must be transient to compare
19. // business value signatures
20. (((IsTransient()) || compareTo.IsTransient()) &&
21. HasSameBusinessSignatureAs(compareTo)));
22. }
23.
24. ///
25. /// Transient objects are not associated with an
26. /// item already in storage. For instance, a
27. /// Customer is transient if its ID is 0.
28. ///

29. public bool IsTransient() {
30. return ID == null || ID.Equals(default(IdT));
31. }
32.
33. ///
34. /// Must be implemented to compare two objects
35. ///

36. public abstract override int GetHashCode();
37.
38. private bool HasSameBusinessSignatureAs(DomainObject compareTo) {
39. return GetHashCode().Equals(compareTo.GetHashCode());
40. }
41.
42. ///
43. /// Returns true if self and the provided domain
44. /// object have the same ID values and the IDs
45. /// are not of the default ID value
46. ///

47. private bool HasSameNonDefaultIdAs(DomainObject compareTo) {
48. return (ID != null &&
49. ! ID.Equals(default(IdT))) &&
50. (compareTo.ID != null &&
51. ! compareTo.ID.Equals(default(IdT))) &&
52. ID.Equals(compareTo.ID);
53. }
54.
55. ///
56. /// Set to protected to allow unit tests to set
57. /// this property via reflection and to allow
58. /// domain objects more flexibility in setting
59. /// this for those objects with assigned IDs.
60. ///

61. protected IdT id = default(IdT);
62. }

public abstract class DomainObject { /// /// ID may be of type string, int, /// custom type, etc. /// public IdT ID { get { return id; } } public override sealed bool Equals(object obj) { DomainObject compareTo = obj as DomainObject; return (compareTo != null) && (HasSameNonDefaultIdAs(compareTo) || // Since the IDs aren't the same, either // of them must be transient to compare // business value signatures (((IsTransient()) || compareTo.IsTransient()) && HasSameBusinessSignatureAs(compareTo))); } /// /// Transient objects are not associated with an /// item already in storage. For instance, a /// Customer is transient if its ID is 0. /// public bool IsTransient() { return ID == null || ID.Equals(default(IdT)); } /// /// Must be implemented to compare two objects /// public abstract override int GetHashCode(); private bool HasSameBusinessSignatureAs(DomainObject compareTo) { return GetHashCode().Equals(compareTo.GetHashCode()); } /// /// Returns true if self and the provided domain /// object have the same ID values and the IDs /// are not of the default ID value /// private bool HasSameNonDefaultIdAs(DomainObject compareTo) { return (ID != null && ! ID.Equals(default(IdT))) && (compareTo.ID != null && ! compareTo.ID.Equals(default(IdT))) && ID.Equals(compareTo.ID); } /// /// Set to protected to allow unit tests to set /// this property via reflection and to allow /// domain objects more flexibility in setting /// this for those objects with assigned IDs. /// protected IdT id = default(IdT); }

Note that Equals is sealed and cannot be overridden by a DomainObject implementation. I suppose it could be unsealed, but since I put a lot of work into that method, I don't want anyone mucking it up!

Now assume that Customer implements DomainObject. As mentioned above, the combination of its company name and contact name give it its unique signature. So its GetHashCode would be as follows:

view plaincopy to clipboardprint?

1. public override int GetHashCode() {
2. return (GetType().FullName + "|" +
3. CompanyName + "|" +
4. ContactName).GetHashCode();
5. }

public override int GetHashCode() { return (GetType().FullName + "|" + CompanyName + "|" + ContactName).GetHashCode(); }

You'll notice that the start of the method includes the full name of the class type itself. With this in place, two different classes would never return the same signature. (You'll have to reconsider how GetHashCode is implemented to handle inheritance structures; e.g. a Customer and an Employee both inherit from a Person class but Customer and Employee may be equal in some instances...for this, I'd probably only add GetHashCode to the Person class.) Additionally, note that GetHashCode should only contain the "business signature" of the object and not include its ID. Including the ID in the signature would make it impossible to find equality between a transient object and a ! transient object. (Equality for all regardless of transience I say!)

Pramod Gupta

Thursday, May 22, 2008

Bulk Binding: FORALL

Bulk Binding: FORALL
When writing your PL/SQL stored procedure bear in mind that SQL statements are still sent to the SQL Engine as opposed to the PL/SQL Engine. Therefore in cases where you are executing several SQL statements in a loop, the resulting context switches could cause a noticeable performance problem.

One solution to this performance solution is the use of "bulk binding." What is that? Well first, you may recall that binding variables allows you to tie the current value of a variable into an SQL statement.
http://thinkoracle.blogspot.com/2005/06/bind-variables-in-plsql.html

"Bulk Binding" refers to a process whereby you can tie the current values of all the elements in an entire collection into a single operation. By using bulk binds, only one context switch is made between the PL/SQL and SQL Engines, to pass the entire collection, thus avoiding those performance issues.

So how is this done? In this case, using FORALL.

Let's look at an example from the Oracle documentation of how you might do something without any knowledge of bulk binding:

DECLARE
TYPE NumList IS VARRAY(20) OF NUMBER;
depts NumList := NumList(10,30,70);
BEGIN
FOR i IN depts.FIRST..depts.LAST LOOP
UPDATE emp SET sal = sal + 100 WHERE deptno = depts(i);
END LOOP;
END;

Using timing techniques I've explained before, here is what I came up with:
http://thinkoracle.blogspot.com/2005/09/analyzing-query-performance.html

call count cpu elapsed disk query
------- ------ -------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0
Execute 3 0.01 0.00 0 9
Fetch 0 0.00 0.00 0 0
------- ------ -------- ---------- ---------- ----------
total 4 0.01 0.00 0 9



Instead, we can use FORALL like this:

DECLARE
TYPE NumList IS VARRAY(20) OF NUMBER;
depts NumList := NumList(10,30,70);
BEGIN
FORALL i IN depts.FIRST..depts.LAST
UPDATE emp SET sal = sal + 100 WHERE deptno = depts(i);
END;

And get this:

call count cpu elapsed disk query
------- ------ -------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0
Execute 1 0.00 0.01 0 9
Fetch 0 0.00 0.00 0 0
------- ------ -------- ---------- ---------- ----------
total 2 0.00 0.01 0 9



Notice 1 execute instead of 3. Of course we won't see any performance difference on a small sample size, but here I am just illustrating the point.

You may notice the absense of the keyword "LOOP" in the FORALL example. That is because despite its similar appearances and syntax in this example, FORALL is not a loop. It takes a single SQL statement, and the index i can be used only as an index into the collection.

You can find more information about FORALL in the PL/SQL User's Guide and Reference: http://download-west.oracle.com/docs/cd/B10501_01/appdev.920/a96624.pdf

Another useful trick is the use of SQL%BULK_ROWCOUNT (SQL is the implicit cursor used by the SQL engine for DML operations). It can be indexed the same way as the collection.

DECLARE
TYPE NumList IS VARRAY(20) OF NUMBER;
depts NumList := NumList(10,30,70);
BEGIN
FORALL i IN depts.FIRST..depts.LAST
UPDATE emp SET sal = sal + 100 WHERE deptno = depts(i);

FOR i IN depts.FIRST..depts.LAST LOOP
DBMS_OUTPUT.PUT_LINE('Affected Rows for ' || depts(i) || ' is ' || SQL%BULK_ROWCOUNT(i));
END LOOP;
END;

Affected Rows for 10 is 3
Affected Rows for 30 is 6
Affected Rows for 70 is 0

The only error I can foresee getting with FORALL is this one:
ORA-22160: element at index does not exist
And you will only get that if you are somehow binding to an index that does not exist.

You may also get complaints if you use the index in an expression, which is not supported:
PLS-00430: FORALL iteration variable i is not allowed in this context

Now that I've spoken about avoiding unnecessary context switches executing SQL statements, what about unnecessary context switches in getting information FROM the SQL engine? Are those avoidable? Yes, using BULK COLLECT. I'll write more about that shortly.

Links:
Dan Morgan has more than just a reference on this topic, he has lots of really good examples: http://www.psoug.org/reference/bulk_collect.html

There is also a short but good write-up on bulk binding in the Application Developer's Guide: Fundamentals, look under Chapter 9:
http://download-west.oracle.com/docs/cd/B10501_01/appdev.920/a96590.pdf

Dr. Tim Hall has some several good write-ups with good examples: http://www.oracle-base.com/articles/8i/BulkBinds8i.php

Tuesday, May 6, 2008

SQL Server 2005 | New XML Capabilities

SQL Server 2005 | New XML Capabilities: "Take Five with SQL Server 2005"

Take Five with SQL Server 2005

Take Five with SQL Server 2005

New XML Capabilities in SQL Server 2005
Kevin S. Goff, Microsoft MVP 2007

kick it on DotNetKicks.com

This is the first installment in a monthly, ten-part series on new developer features in SQL Server 2005. The objective of this series is to provide software developers with quick information and meaningful examples that demonstrate new capabilities in SQL 2005. Many developers can testify that being able to effectively utilize the tools in SQL Server can increase the value they bring to a software project. This series will cover the following:

  • New XML capabilities (subject of this article)
  • A three-part series on new T-SQL 2005 language features
  • SQL Server 2005 Service Broker
  • SQL Server 2005 Isolation Levels
  • A two-part series on writing procedures and functions in the .NET CLR, and in-process data access
  • SQL Server 2005 Reporting Services
  • ADO.NET

The title of this series ("Take Five with SQL 2005") reflects the goal of providing useful programming examples in a short period of time. Despite a number of great SQL 2005 books, some developers don’t always have time to study the material in a book. So these articles will focus on code samples that developers can adapt to their applications.

So let’s begin! Microsoft beefed up support of XML in SQL 2005. Here are the items we’ll cover:

  1. The new XML datatype in SQL 2005, and the new associated methods (query, exists, value, and nodes)
  2. A reusable T-SQL UDF to convert XML to a table variable
  3. New SELECT FOR….XML capabilities
  4. Performing partial text searches inside an XML DataColumn


1- XML Datatype and associated methods:

Prior to SQL Server 2005, if developers wanted to convert XML data to relational data, they had to use combinations of the stored procedure sp_xml_preparedocument and the OPENXML function. While still valid, this methodology introduces some overhead. SQL 2005 provides native support for the XML data type, and new methods to directly parse and read the data.

Listing A provides a demonstration. The code creates a basic XML string of names and cities, and uses the nodes method to create a row for each instance of the Customer node. We must provide a table/column name combination to use the results, so we’ll call it tab(col). In the SELECT statement, the code will use the value method to strip out pieces of the tab(col) structure. Finally, the query uses the exist method, which evaluates a comparison condition and returns either a 0 or 1. Note that the syntax for exist doesn’t allow us to concatenate a search variable – but we can use the sql:variable syntax to embed any variable or parameter.

Listing A
(using nodes to convert XML to relational data)

-- using the XML Nodes methd to turn XML data

-- into Relational Data

DECLARE @XMLString XML

SET @XMLString ='

Kevin

Goff

Camp Hill

Steve

Goff

Philadelphia

'

-- use the basic query method

SELECT @XMLString.query('/Customers/Customer')

-- Now query using the nodes method

-- Read the columns using the value method, and set a condition using exist

-- Set a search condition

DECLARE @cLookup VARCHAR(50)

SET @cLookup ='Camp Hill'

SELECT

tab.col.value('FirstName[1]','VARCHAR(20)')AS FirstName,

tab.col.value('LastName[1]','VARCHAR(20)')AS LastName,

tab.col.value('City[1]','VARCHAR(20)')AS City

FROM @XMLString.nodes('//Customer') tab(col)

WHERE tab.col.exist('City/text()[.= sql:variable("@cLookup")] ')= 1


2 - A reusable T-SQL UDF to convert XML to a table variable

Many database applications execute queries based on a variable number of selections. For example, a product report might prompt the user with a menu of dozens or even hundreds of products. The user selects any number of products, and the database must query against the information for those products.

Over the years, developers have come up with different ways to write a stored procedure that accepts a variable number of selections. Some developers will pass a comma-delimited string of integer PK values representing the selections, write a UDF in T-SQL to convert the string to a table variable, and then query against the table variable. Other developers will pass an XML string of selections, and then use sp_xml_preparedocument and OPENXML to convert the XML selections to a table variable. The problem with the latter method is that SQL Server doesn’t allow the use of sp_xml_preparedocument inside a UDF, making it difficult to write a generic, reusable method.

Once again, the new XML features in SQL 2005 make this task easier, faster, and less resource intensive. If you create an XML string [or an .NET Dataset/object that can serialize as XML] with a column called PKValue that stores user-selections, you can write a UDF like the one in Listing B, to convert the XML string to a table variable, using the node method previously covered in this article. Listing B contains both a UDF as well as a sample to utilize it.

Listing B – UDF XML2TableVar
Converts an XML string with a node of PKValue to a table variable

CREATEFUNCTION [dbo].[XML2TableVar]

( @XMLString XML)

RETURNS

@tPKList TABLE ( PKValue int)

AS

BEGIN

INSERTINTO @tPKList

SELECT Tbl.Col.value('.','int')

FROM @XMLString.nodes('//PKValue') Tbl( Col )

RETURN

END

-- Example Usage

-- Create an XML string to use in a subsequent join against a customer table

DECLARE @XMLString XML

SET @XMLString ='

2

5

'

-- Show the contents

SELECT *FROM [dbo].[XML2TableVar] (@XMLString) CustList

-- Perform a JOIN using the UDF

SELECT Customers.*FROM Customers

JOIN [dbo].[XML2TableVar] (@XMLString) CustList

ON Customers.CustomerPK = CustList.PKValue


3 - New SELECT FOR….XML capabilities

Listing C shows examples of querying to XML format, including the ability to specify an attribute-centric or element-centric format.

Listing C
(SELECT…FOR XML capabilities)

-- attribute-centric schema

SELECT*FROM orders WHERE OrderID = 10447

FORXMLAUTO,XMLSCHEMA

-- element-centric schema

SELECT*FROM orders WHERE OrderID = 10447

FORXMLAUTO, ELEMENTS ,XMLSCHEMA


4 - Performing partial text searches inside an XML DataColumn

Last, but certainly not least – some database store information in an XML format. For example, some will store address information inside an XML column. It can become a bit of a challenge to query that data, especially if end users want the ability to perform partial text searches.

Once again, we can use the exist method and clauses within the exist method to perform these types of queries. Listing D creates a simple address XML column, populates some sample data, and then performs a partial text query as follows:

WHERE Address.exist('/Address/AddressRecord [contains(@City,"burg")]')= 1

Listing D
(Performing partial text searches)

DECLARE @tTestAddress TABLE(Address XML)

INSERTINTO @tTestAddress VALUES('

CustomerID = "1"

Street=" 1945 14th Ave " City=" Atlantic City " State="NJ" Zip="91912" />

')

insertinto @tTestAddress values('

CustomerID = "2"

Street=" 27 Main Street " City=" Harrisburg " State="PA" Zip="44555"/>

')

SELECT*FROM @tTestAddress WHERE

Address.exist('/Address/AddressRecord [contains(@City,"burg")]')= 1


Recommended Reading :

For every piece of information that an article covers, there are always multiple items that aren’t covered. That’s a testament to how much XML functionality exists in SQL 2005. There are new capabilities for OPENROWSET, a modify method in the XML data type to manipulate XML data, and so much more. The objective of this article was to provide a starting point with some examples of the new XML. Here are some additional resources that provide additional information.

Remember…GOOGLE is your friend. Sometimes just taking thirty minutes to Google some XML keywords will lead to more valuable information than you ever thought possible.

Writing CLR Stored Procedures in C#

Writing CLR Stored Procedures in C#: "Writing CLR Stored Procedures in C#

As SQL Server 2005 rolls out DBA's are going to be forced to learn either C# or Visual Basic or both. Until now these were client side languages and not knowing them had little impact on your job. And if you write code in these languages your going to have to learn to use Visual Studio. This article covers the basics of C# and Visual Studio using a basic stored procedure as an example. It's written using the April CTP of SQL Server 2005.

This article shows you how http://www.sqlteam.com/item.asp?ItemID=21927"

Writing CLR Stored Procedures in C# - Introduction to C# (Part 1) - SQLTeam.com

Writing CLR Stored Procedures in C# - Introduction to C# (Part 1) - SQLTeam.com