Saturday, June 14, 2008
Friday, June 6, 2008
Using GetHashCode and Equals
In my project work, I consider the following to be true when comparing two objects:
* If two objects have IDs, and have the same IDs, then they may be considered equal without examining them further. (I'm assuming ID to be an identity field, or equivalent, in the DB.)
* If two objects have IDs, but have different IDs, than they may be considered not equal without examining them further. E.g., If customer A has an ID of 4 while customer B has an ID of 5, then they are not equal, QED.
* If neither object has an ID or only one of them has an ID, but their "business value signatures" are identical, then they're equal. E.g., customer A has an ID of 4 and a social-security-number of 123-45-6789 while customer B has no ID but also has a social-security-number of 123-45-6789. In this case, customer A and customer B are equal. By "business signatures," I imply a combination of those properties which deem an entity unique, regardless of its ID.
* If one of them is null, then they are not equal. (Whoo, that was easy.)
With the above considerations in mind, we'll want to write code to make the following unit test pass. Note that Customer takes a company name into its constructor. It also has a settable contact name. The combination of its company name and contact name give Customer its unique business signature.
view plaincopy to clipboardprint?
1. [Test]
2. public void CanCompareCustomers() {
3. Customer customerA = new Customer("Acme");
4. Customer customerB = new Customer("Anvil");
5.
6. Assert.AreNotEqual(customerA, null);
7. Assert.AreNotEqual(customerA, customerB);
8.
9. customerA.SetIdTo(1);
10. customerB.SetIdTo(1);
11.
12. // Even though the signatures are different,
13. // the persistent IDs were the same. Call me
14. // crazy, but I put that much trust into IDs.
15. Assert.AreEqual(customerA, customerB);
16.
17. Customer customerC = new Customer("Acme");
18.
19. // Since customerA has an ID but customerC
20. // doesn't, their signatures will be compared
21. Assert.AreEqual(customerA, customerC);
22.
23. customerC.ContactName = "Coyote";
24.
25. // Signatures are now different
26. Assert.AreNotEqual(customerA, customerC);
27.
28. // customerA.Equals(customerB) because they
29. // have the same ID.
30. // customerA.Equals(customerC) because they
31. // have the same signature.
32. // customerB.DoesNotEquals(customerC) because
33. // we can't compare their IDs, since
34. // customerC is transient, and their
35. // signatures are different.
36. Assert.AreNotEqual(customerB, customerC);
37. }
[Test] public void CanCompareCustomers() { Customer customerA = new Customer("Acme"); Customer customerB = new Customer("Anvil"); Assert.AreNotEqual(customerA, null); Assert.AreNotEqual(customerA, customerB); customerA.SetIdTo(1); customerB.SetIdTo(1); // Even though the signatures are different, // the persistent IDs were the same. Call me // crazy, but I put that much trust into IDs. Assert.AreEqual(customerA, customerB); Customer customerC = new Customer("Acme"); // Since customerA has an ID but customerC // doesn't, their signatures will be compared Assert.AreEqual(customerA, customerC); customerC.ContactName = "Coyote"; // Signatures are now different Assert.AreNotEqual(customerA, customerC); // customerA.Equals(customerB) because they // have the same ID. // customerA.Equals(customerC) because they // have the same signature. // customerB.DoesNotEquals(customerC) because // we can't compare their IDs, since // customerC is transient, and their // signatures are different. Assert.AreNotEqual(customerB, customerC); }
Although some argue against a single object which all other persistable domain objects inherit from, I use one nonetheless and ingeniously call it "DomainObject." (Those are Dr. Evil quotes there.) "DomainObject," in its entirety, contains the following:
view plaincopy to clipboardprint?
1. public abstract class DomainObject
2. {
3. ///
4. /// ID may be of type string, int,
5. /// custom type, etc.
6. ///
7. public IdT ID {
8. get { return id; }
9. }
10.
11. public override sealed bool Equals(object obj) {
12. DomainObject
13. obj as DomainObject
14.
15. return (compareTo != null) &&
16. (HasSameNonDefaultIdAs(compareTo) ||
17. // Since the IDs aren't the same, either
18. // of them must be transient to compare
19. // business value signatures
20. (((IsTransient()) || compareTo.IsTransient()) &&
21. HasSameBusinessSignatureAs(compareTo)));
22. }
23.
24. ///
25. /// Transient objects are not associated with an
26. /// item already in storage. For instance, a
27. /// Customer is transient if its ID is 0.
28. ///
29. public bool IsTransient() {
30. return ID == null || ID.Equals(default(IdT));
31. }
32.
33. ///
34. /// Must be implemented to compare two objects
35. ///
36. public abstract override int GetHashCode();
37.
38. private bool HasSameBusinessSignatureAs(DomainObject
39. return GetHashCode().Equals(compareTo.GetHashCode());
40. }
41.
42. ///
43. /// Returns true if self and the provided domain
44. /// object have the same ID values and the IDs
45. /// are not of the default ID value
46. ///
47. private bool HasSameNonDefaultIdAs(DomainObject
48. return (ID != null &&
49. ! ID.Equals(default(IdT))) &&
50. (compareTo.ID != null &&
51. ! compareTo.ID.Equals(default(IdT))) &&
52. ID.Equals(compareTo.ID);
53. }
54.
55. ///
56. /// Set to protected to allow unit tests to set
57. /// this property via reflection and to allow
58. /// domain objects more flexibility in setting
59. /// this for those objects with assigned IDs.
60. ///
61. protected IdT id = default(IdT);
62. }
public abstract class DomainObject
Note that Equals is sealed and cannot be overridden by a DomainObject implementation. I suppose it could be unsealed, but since I put a lot of work into that method, I don't want anyone mucking it up!
Now assume that Customer implements DomainObject. As mentioned above, the combination of its company name and contact name give it its unique signature. So its GetHashCode would be as follows:
view plaincopy to clipboardprint?
1. public override int GetHashCode() {
2. return (GetType().FullName + "|" +
3. CompanyName + "|" +
4. ContactName).GetHashCode();
5. }
public override int GetHashCode() { return (GetType().FullName + "|" + CompanyName + "|" + ContactName).GetHashCode(); }
You'll notice that the start of the method includes the full name of the class type itself. With this in place, two different classes would never return the same signature. (You'll have to reconsider how GetHashCode is implemented to handle inheritance structures; e.g. a Customer and an Employee both inherit from a Person class but Customer and Employee may be equal in some instances...for this, I'd probably only add GetHashCode to the Person class.) Additionally, note that GetHashCode should only contain the "business signature" of the object and not include its ID. Including the ID in the signature would make it impossible to find equality between a transient object and a ! transient object. (Equality for all regardless of transience I say!)
Pramod Gupta
Thursday, May 22, 2008
Bulk Binding: FORALL
When writing your PL/SQL stored procedure bear in mind that SQL statements are still sent to the SQL Engine as opposed to the PL/SQL Engine. Therefore in cases where you are executing several SQL statements in a loop, the resulting context switches could cause a noticeable performance problem.
One solution to this performance solution is the use of "bulk binding." What is that? Well first, you may recall that binding variables allows you to tie the current value of a variable into an SQL statement.
http://thinkoracle.blogspot.com/2005/06/bind-variables-in-plsql.html
"Bulk Binding" refers to a process whereby you can tie the current values of all the elements in an entire collection into a single operation. By using bulk binds, only one context switch is made between the PL/SQL and SQL Engines, to pass the entire collection, thus avoiding those performance issues.
So how is this done? In this case, using FORALL.
Let's look at an example from the Oracle documentation of how you might do something without any knowledge of bulk binding:
DECLARE
TYPE NumList IS VARRAY(20) OF NUMBER;
depts NumList := NumList(10,30,70);
BEGIN
FOR i IN depts.FIRST..depts.LAST LOOP
UPDATE emp SET sal = sal + 100 WHERE deptno = depts(i);
END LOOP;
END;
Using timing techniques I've explained before, here is what I came up with:
http://thinkoracle.blogspot.com/2005/09/analyzing-query-performance.html
call count cpu elapsed disk query
------- ------ -------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0
Execute 3 0.01 0.00 0 9
Fetch 0 0.00 0.00 0 0
------- ------ -------- ---------- ---------- ----------
total 4 0.01 0.00 0 9
Instead, we can use FORALL like this:
DECLARE
TYPE NumList IS VARRAY(20) OF NUMBER;
depts NumList := NumList(10,30,70);
BEGIN
FORALL i IN depts.FIRST..depts.LAST
UPDATE emp SET sal = sal + 100 WHERE deptno = depts(i);
END;
And get this:
call count cpu elapsed disk query
------- ------ -------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0
Execute 1 0.00 0.01 0 9
Fetch 0 0.00 0.00 0 0
------- ------ -------- ---------- ---------- ----------
total 2 0.00 0.01 0 9
Notice 1 execute instead of 3. Of course we won't see any performance difference on a small sample size, but here I am just illustrating the point.
You may notice the absense of the keyword "LOOP" in the FORALL example. That is because despite its similar appearances and syntax in this example, FORALL is not a loop. It takes a single SQL statement, and the index i can be used only as an index into the collection.
You can find more information about FORALL in the PL/SQL User's Guide and Reference: http://download-west.oracle.com/docs/cd/B10501_01/appdev.920/a96624.pdf
Another useful trick is the use of SQL%BULK_ROWCOUNT (SQL is the implicit cursor used by the SQL engine for DML operations). It can be indexed the same way as the collection.
DECLARE
TYPE NumList IS VARRAY(20) OF NUMBER;
depts NumList := NumList(10,30,70);
BEGIN
FORALL i IN depts.FIRST..depts.LAST
UPDATE emp SET sal = sal + 100 WHERE deptno = depts(i);
FOR i IN depts.FIRST..depts.LAST LOOP
DBMS_OUTPUT.PUT_LINE('Affected Rows for ' || depts(i) || ' is ' || SQL%BULK_ROWCOUNT(i));
END LOOP;
END;
Affected Rows for 10 is 3
Affected Rows for 30 is 6
Affected Rows for 70 is 0
The only error I can foresee getting with FORALL is this one:
ORA-22160: element at index does not exist
And you will only get that if you are somehow binding to an index that does not exist.
You may also get complaints if you use the index in an expression, which is not supported:
PLS-00430: FORALL iteration variable i is not allowed in this context
Now that I've spoken about avoiding unnecessary context switches executing SQL statements, what about unnecessary context switches in getting information FROM the SQL engine? Are those avoidable? Yes, using BULK COLLECT. I'll write more about that shortly.
Links:
Dan Morgan has more than just a reference on this topic, he has lots of really good examples: http://www.psoug.org/reference/bulk_collect.html
There is also a short but good write-up on bulk binding in the Application Developer's Guide: Fundamentals, look under Chapter 9:
http://download-west.oracle.com/docs/cd/B10501_01/appdev.920/a96590.pdf
Dr. Tim Hall has some several good write-ups with good examples: http://www.oracle-base.com/articles/8i/BulkBinds8i.php
Thursday, May 8, 2008
Tuesday, May 6, 2008
SQL Server 2005 | New XML Capabilities
Take Five with SQL Server 2005
Take Five with SQL Server 2005
New XML Capabilities in SQL Server 2005
Kevin S. Goff, Microsoft MVP 2007
This is the first installment in a monthly, ten-part series on new developer features in SQL Server 2005. The objective of this series is to provide software developers with quick information and meaningful examples that demonstrate new capabilities in SQL 2005. Many developers can testify that being able to effectively utilize the tools in SQL Server can increase the value they bring to a software project. This series will cover the following:
- New XML capabilities (subject of this article)
- A three-part series on new T-SQL 2005 language features
- SQL Server 2005 Service Broker
- SQL Server 2005 Isolation Levels
- A two-part series on writing procedures and functions in the .NET CLR, and in-process data access
- SQL Server 2005 Reporting Services
- ADO.NET
The title of this series ("Take Five with SQL 2005") reflects the goal of providing useful programming examples in a short period of time. Despite a number of great SQL 2005 books, some developers don’t always have time to study the material in a book. So these articles will focus on code samples that developers can adapt to their applications.
So let’s begin! Microsoft beefed up support of XML in SQL 2005. Here are the items we’ll cover:
- The new XML datatype in SQL 2005, and the new associated methods (query, exists, value, and nodes)
- A reusable T-SQL UDF to convert XML to a table variable
- New SELECT FOR….XML capabilities
- Performing partial text searches inside an XML DataColumn
1- XML Datatype and associated methods:
Prior to SQL Server 2005, if developers wanted to convert XML data to relational data, they had to use combinations of the stored procedure sp_xml_preparedocument and the OPENXML function. While still valid, this methodology introduces some overhead. SQL 2005 provides native support for the XML data type, and new methods to directly parse and read the data.
Listing A provides a demonstration. The code creates a basic XML string of names and cities, and uses the nodes method to create a row for each instance of the Customer node. We must provide a table/column name combination to use the results, so we’ll call it tab(col). In the SELECT statement, the code will use the value method to strip out pieces of the tab(col) structure. Finally, the query uses the exist method, which evaluates a comparison condition and returns either a 0 or 1. Note that the syntax for exist doesn’t allow us to concatenate a search variable – but we can use the sql:variable syntax to embed any variable or parameter.
Listing A
(using nodes to convert XML to relational data)
-- using the XML Nodes methd to turn XML data
-- into Relational Data
DECLARE @XMLString XML
SET @XMLString ='
'
-- use the basic query method
SELECT @XMLString.query('/Customers/Customer')
-- Now query using the nodes method
-- Read the columns using the value method, and set a condition using exist
-- Set a search condition
DECLARE @cLookup VARCHAR(50)
SET @cLookup ='Camp Hill'
SELECT
tab.col.value('FirstName[1]','VARCHAR(20)')AS FirstName,
tab.col.value('LastName[1]','VARCHAR(20)')AS LastName,
tab.col.value('City[1]','VARCHAR(20)')AS City
FROM @XMLString.nodes('//Customer') tab(col)
WHERE tab.col.exist('City/text()[.= sql:variable("@cLookup")] ')= 1
2 - A reusable T-SQL UDF to convert XML to a table variable
Many database applications execute queries based on a variable number of selections. For example, a product report might prompt the user with a menu of dozens or even hundreds of products. The user selects any number of products, and the database must query against the information for those products.
Over the years, developers have come up with different ways to write a stored procedure that accepts a variable number of selections. Some developers will pass a comma-delimited string of integer PK values representing the selections, write a UDF in T-SQL to convert the string to a table variable, and then query against the table variable. Other developers will pass an XML string of selections, and then use sp_xml_preparedocument and OPENXML to convert the XML selections to a table variable. The problem with the latter method is that SQL Server doesn’t allow the use of sp_xml_preparedocument inside a UDF, making it difficult to write a generic, reusable method.
Once again, the new XML features in SQL 2005 make this task easier, faster, and less resource intensive. If you create an XML string [or an .NET Dataset/object that can serialize as XML] with a column called PKValue that stores user-selections, you can write a UDF like the one in Listing B, to convert the XML string to a table variable, using the node method previously covered in this article. Listing B contains both a UDF as well as a sample to utilize it.
Listing B – UDF XML2TableVar
Converts an XML string with a node of PKValue to a table variable
CREATEFUNCTION [dbo].[XML2TableVar]
( @XMLString XML)
RETURNS
@tPKList TABLE ( PKValue int)
AS
BEGIN
INSERTINTO @tPKList
SELECT Tbl.Col.value('.','int')
FROM @XMLString.nodes('//PKValue') Tbl(
RETURN
END
-- Example Usage
-- Create an XML string to use in a subsequent join against a customer table
DECLARE @XMLString XML
SET @XMLString ='
'
-- Show the contents
SELECT *FROM [dbo].[XML2TableVar] (@XMLString) CustList
-- Perform a JOIN using the UDF
SELECT Customers.*FROM Customers
JOIN [dbo].[XML2TableVar] (@XMLString) CustList
ON Customers.CustomerPK = CustList.PKValue
3 - New SELECT FOR….XML capabilities
Listing C shows examples of querying to XML format, including the ability to specify an attribute-centric or element-centric format.
Listing C
(SELECT…FOR XML capabilities)
-- attribute-centric schema
SELECT*FROM orders WHERE OrderID = 10447
FORXMLAUTO,XMLSCHEMA
-- element-centric schema
SELECT*FROM orders WHERE OrderID = 10447
FORXMLAUTO, ELEMENTS ,XMLSCHEMA
4 - Performing partial text searches inside an XML DataColumn
Last, but certainly not least – some database store information in an XML format. For example, some will store address information inside an XML column. It can become a bit of a challenge to query that data, especially if end users want the ability to perform partial text searches.
Once again, we can use the exist method and clauses within the exist method to perform these types of queries. Listing D creates a simple address XML column, populates some sample data, and then performs a partial text query as follows:
WHERE Address.exist('/Address/AddressRecord [contains(@City,"burg")]')= 1
Listing D
(Performing partial text searches)
DECLARE @tTestAddress TABLE(Address XML)
INSERTINTO @tTestAddress VALUES('
CustomerID = "1"
Street="
')
insertinto @tTestAddress values('
CustomerID = "2"
Street="
')
SELECT*FROM @tTestAddress WHERE
Address.exist('/Address/AddressRecord [contains(@City,"burg")]')= 1
Recommended
For every piece of information that an article covers, there are always multiple items that aren’t covered. That’s a testament to how much XML functionality exists in SQL 2005. There are new capabilities for OPENROWSET, a modify method in the XML data type to manipulate XML data, and so much more. The objective of this article was to provide a starting point with some examples of the new XML. Here are some additional resources that provide additional information.
- Shawn Wildermuth, Making Sense of the XML DataType in SQL Server 2005, May/June 2006 issue of CoDe Magazine.
- Bob Beauchemin and Dan Sullivan, A Developer's Guide to SQL Server 2005 (Microsoft .NET Development Series) – there are multiple chapters on XML
Remember…GOOGLE is your friend. Sometimes just taking thirty minutes to Google some XML keywords will lead to more valuable information than you ever thought possible.
Writing CLR Stored Procedures in C#
As SQL Server 2005 rolls out DBA's are going to be forced to learn either C# or Visual Basic or both. Until now these were client side languages and not knowing them had little impact on your job. And if you write code in these languages your going to have to learn to use Visual Studio. This article covers the basics of C# and Visual Studio using a basic stored procedure as an example. It's written using the April CTP of SQL Server 2005.
This article shows you how http://www.sqlteam.com/item.asp?ItemID=21927"