Marcus Mac Innes' Blog

Irish Views on .NET, C# and of course "Services"...
posts - 48, comments - 447, trackbacks - 64

Synchronized .NET Collection Classes are NOT Thread Safe.

Call me a purist, but when someone asks me whether something is thread safe, I like to re-phrase the question in my head to something like "Is this air traffic control system thread safe" before answering...Yes or No?

So you can imagine my surprise then after finally tracking down a threading issue only to discover that the bug was in fact caused by the .NET synchronized Hashtable and it turns out that the synchronized Hashtable is in fact not thread safe, but thread safe "ish" and to me that's NOT thread safe at all.

The documentation for Hashtable and other collection objects which implement System.Collections.ICollection indicates that the static method Synchronized "Returns a synchronized (thread-safe) wrapper for the Hashtable.". Creating one coundn't be simpler, a small change in the Hashtable's initialisation provides you with thread safety...

Hashtable threadSafeHashtable = Hashtable.Synchronized(new Hashtable());

The Hashtable class contains a private class called SyncHashtable which simply subclasses Hashtable and overrides its properties and methods. In reality, these wrappers synchronize data access by simply wrapping access in a lock statement. Here is the Hashtable's synchronized Add method:

public override void Add(object key, object value)
{
      lock (this._table.SyncRoot)
      {
            this._table.Add(key, value);
      }
}

Perfect right?

Well not exactly... You see the Add method above is thread safe. Only one thread can add to the Hashtable at a time. But if we look a little closer, we find that not all properties and methods provide lock safety. Here is the implementation of the indexer which you will notice is missing a lock on the "get":

public override object this[object key]
{
      get
      {
            return this._table[key];
      }
      set
      {
            lock (this._table.SyncRoot)
            {
                  this._table[key] = value;
            }
      }
}

This is a typical example of naive thread safety. The assumption goes that mutliple threads can simultaneously read from an object safely, but only one thread at a time can write to the object safely. And this is the implementation we see in System.Collection.Hashtable (above).

This design works perfectly well and provides optimum performance since we are not holding up threads who only wish to read an objects data, they simply work away oblivious to each other, only when a thread wishes to write does it need to synchronize its access to the data. There is however one HUGE assumption in this design, the writes must be "atomic".

In some cases writes to memory locations are actually atomic. For instance setting a value of a bool or int on 32bit single processor machine in .NET 1.1 just happens to be atomic. We run into problems on multi processor machines however due to processor level memory caching and often need to qualify writes with System.Threading.Thread.MemoryBarrier() (discussed here) which flushes the processor's memory cache. But now we have 2 separate statements which together are definitely not atomic.

So why is this important when looking at the synchronization design of the System.Collections.Hashtable.SyncHashtable class?

Well if a thread always has the green light to perform a read, then it makes the assumption that the data to be read exists in a fully written state. If it were allowed to read "while" another thread was in the middle of write (albeit one thread writing at a time), then the data could be half written when the read is performed and hence we have a subtle and often difficult to locate threading bug.

Other methods on SyncHashtable that fall foul of the design flaw include: Contains, ContainsKey, ContainsValue, CopyTo, GetEnumerator, GetHash and KeyEquals.

Don't get caught out!

[Update:] As of .NET v2.0.40607 SyncHashtable.CopyTo has been upgraded to include a lock.

posted on Tuesday, September 21, 2004 12:01 PM

Feedback

# Synchronized .NET collection classes not thread safe?

9/22/2004 5:08 PM | Frans Bouma's blog

# re: Synchronized .NET Collection Classes are NOT Thread Safe.

Hi Marcus:

Very good catch, I'm one who has made the assumption that all methods on a synchronized collection would involve locks for complete thread safety.

One note, however, is that a memory barrier has nothing to do with atomicity or processor cache. A memory barrier is all about preserving the ordering of read and write operations. If something happens with the cache because the compiler or hardware have to preserve ordering, that would be a side effect. The double check locking problem BradA discusses could happen on an architecture without cache, but with weak ordering.
9/22/2004 7:29 PM | Scott Allen

# re: Synchronized .NET Collection Classes are NOT Thread Safe.

Actually it does have to do with Cache. The misordering of reads/writes is a model of what happens when multiple processors are looking at data through the cache.
9/22/2004 9:12 PM | Dan Golick

# re: Synchronized .NET Collection Classes are NOT Thread Safe.

OK I think you picked a bad example.
With careful study I think we can find some problems in the SyncHashtable implementation. I've always been suspicious of CopyTo.

First of all the data stored in the collection cannot be "half written". Why? Because the collection only stores objects. The writing/reading of an object is guaranteed to be automic (just like Int32). An object handle (the value written to the hashtable) is actually a 32-bit pointer. Even if we are storing a big-old struct in the hashtable it is boxed before being written as the value. Thus the copy in and out of the value is atomic (note that if you are not boxing the larger types like double and Int64 can be half-written so multithreaded code must protect access to them).

The second question is the call this._table[key]; going to be thread-safe. This ends of calling the base implementation:
public virtual object this[object key]
{
 get
 {
  uint num1;
  uint num2;
  Hashtable.bucket bucket1;
  if (key == null)
  {
   throw new ArgumentNullException("key", Environment.GetResourceString("ArgumentNull_Key"));
  }
  Hashtable.bucket[] bucketArray1 = this.buckets;
  uint num3 = this.InitHash(key, bucketArray1.Length, out num1, out num2);
  int num4 = 0;
  int num5 = (int) (num1 % bucketArray1.Length);
  do
  {
   bucket1 = bucketArray1[num5];
   if (bucket1.key == null)
   {
    return null;
   }
   if (((bucket1.hash_coll & 0x7fffffff) == ((long) num3)) && this.KeyEquals(key, bucket1.key))
   {
    return bucket1.val;
   }
   num5 = (int) ((num5 + ((long) num2)) % bucketArray1.Length);
  }
  while ((bucket1.hash_coll < 0) && (++num4 < bucketArray1.Length));
  return null;
 }
 set
 {
  this.Insert(key, value, false);
 }
}




Now the key for thread-safety here is:
Hashtable.bucket[] bucketArray1 = this.buckets;
The get code does a 1 line copy of the buckets. This is a thread-safe copy. Of the array in case buckets is reallocated by a writer thread.


Bottom-line is I believe this[key] is thread-safe on intel/amd processors.

Nonetheless I'm still suspicious of some of the read functions particularly because the way they have written the code uses the order of reads and writes to guarantee thread safety.

If you have looked into MemoryBarrier you will see that the CLR does not guarantee that read and writes will appear to have been done in the coded order. That is if on processor 1 I write A, B, C on processor when I read the three items it may appear as though B was written before A. This is a cache coherency problem. The Monitor.Enter/Exit (lock(syncroot)) guarantees memory barriers at the enter and exit. So if SynchHashtable locked the read accesses it guarantees that there is no cache coherency problem.

On Intel/AMD platforms the processor actually provides a stronger memory model that the CLR requires. On these platforms I think the code is safe. But if we run on some future platform with weaker guarantees the read code may not be safe.


Hashtable is interesting because the framework "guarantees" that is thread safe for a single writer and multiple readers. Unfotunately it doesn't quite hold up on this promise . See Brad Abrams: http://blogs.msdn.com/brada/archive/2003/04/13/49969.aspx

Given this guarantee the implementer of SyncHash felt it was safe to not lock read-only accesses. I think this was unfortunate because it is really hard to prove that these accesses are thread safe.

I have my own wrapper around Hashtable to use for synchronized access that locks all accesses. I sleep easier at night knowing that I am accessing my reads and writes inside a lock(syncroot).
9/23/2004 2:09 PM | Dan Golick

# re: Synchronized .NET Collection Classes are NOT Thread Safe.

Thanks for the analysis Dan. On inspection, your conclusion regarding the indexer seems tight. I used it because it was a simple one to understand. In retrospect I should have made an example of CopyTo which is the worst problem. But as you can see from my update CopyTo seems to have been fixed in .NET 2.0.
9/23/2004 2:46 PM | Marcus Mac Innes

# re: Synchronized .NET Collection Classes are NOT Thread Safe.

Dave:

"Now the key for thread-safety here is:
Hashtable.bucket[] bucketArray1 = this.buckets;
The get code does a 1 line copy of the buckets. This is a thread-safe copy. Of the array in case buckets is reallocated by a writer thread. "

That's not a deep copy of the array, it is copying a reference to the array. You couldn't copy a 1,000 element array with this one line and call it thread safe without a lock.
9/30/2004 3:29 PM | Scott Allen

# re: Synchronized .NET Collection Classes are NOT Thread Safe.

thanks good article.. i almost used this but will keep my locks in place
12/3/2004 9:55 AM | rx

# Dan, you missed the point and glossed over the details

Quoting your comments:
===========================================
Now the key for thread-safety here is:

Hashtable.bucket[] bucketArray1 = this.buckets;

The get code does a 1 line copy of the buckets. This is a thread-safe copy. Of the array in case buckets is reallocated by a writer thread. Bottom-line is I believe this[key] is thread-safe on intel/amd processors.
===========================================

Yes, that line of code is thread-safe (if the claimed about "object reference assignment is thread-safe" is true, as i'm not sure which documentation can back this up), but what about "the way bucketArray1 was used later on" ?
4/26/2005 7:01 AM | Bill V

# re: Synchronized .NET Collection Classes are NOT Thread Safe.

> Bottom-line is I believe this[key] is thread-safe on intel/amd processors

ok, suppose youhave a couple of threads running. Thread 1 calls this[key]. Within the get indexer, thread 1 gets to the point where this.buckets is copied into bucketArray1. This is thread-safe as you pointed out. But let's not be too hasty before deciding that this[key] is thread safe, too.

The reader thread continues, and gets to the point where it compares the passed in key, against the key for the given bucket. It's a match. It's just about to call "return bucket.val" when....

a thread switch happens.

The 2nd thread coincidentally is writing the same hashtable. Imagine that! Thread 2 does this in a couple operations.
First, thread 2 removes a key. Coincidentally, the key it removes is the same key used in thread 1, suspended within the get indexer. Now the bucket that contained that key is "free", and the key for that bucket is null.

The 2nd thread continues, and it adds a new key/value pair. Just by a stroke of luck, the new key happens to get stored in the same bucket that had stored the old key. Suppose thread 2 initializes the value, also. Or maybe it doesn't. Doesn't matter. Thread 2 is now suspended by the scheduler. The bucket now contains a brand-new key/value pair.

Thread 1 wakes up, calls "return bucket.val". By now the key he is referencing is invalid, but thread 1 does not know that. The bucket reference he has now points to a different key. And a different value.

Boom!

6/3/2005 12:42 AM | dino

# re: Synchronized .NET Collection Classes are NOT Thread Safe.

Im not 100% sure I agree with your example in this case. Im Im reading it wrong, I apologize and steer me correct : )
You're saying the chance exists the write will not occur as an atomic operation but the read will expect it there. The write isn't an atomic operation since it occurs like such:
After calculating the hash and determining the bucket we see:
this.buckets[bucketNum].val = nvalue;
this.buckets[bucketNum].key = key;
this.buckets[bucketNum].hash_coll |= ((int)hashKey );
this.count++;
so if we assume the read can occur in the middle of this we'd have a value written to a bucket but not yet a key.
However, if we check the Item[] method, you'll see:
if (bucket1.key == null)
{
return null;
}

which would prevent against this. So reading a requested key at any time before the key is written (in of itself considered atomic - since we're just assigning a pointer to the object or a boxed value - since the last assembly language instruction that would matter would be the move of this value into the DWORD PTR representing the key) would just return a null value since the bucket's key isn't yet set.
Thoughts? send an email to my first name @secure-coding.com
6/29/2005 6:35 PM | Adam Tuliper

# re: Synchronized .NET Collection Classes are NOT Thread Safe.


http://www.vatan.tc
http://www.internet7.org
http://www.e-sorgulama.com
http://www.forumitiraf.com
http://www.islamiruyatabirleri.com
http://forum.vatan.tc/sitemaps-0.html
http://forum.vatan.tc/sitemaps-1.html
http://forum.vatan.tc/sitemaps-2.html
http://forum.vatan.tc/sitemaps-3.html
http://forum.vatan.tc/sitemaps-4.html
http://forum.vatan.tc/sitemaps-5.html
http://forum.vatan.tc/sitemaps-6.html
http://forum.vatan.tc/sitemaps-7.html
http://forum.vatan.tc/sitemaps-8.html
http://forum.vatan.tc/sitemaps-9.html
http://forum.vatan.tc/sitemaps-10.html
http://forum.vatan.tc/sitemaps-11.html
http://forum.vatan.tc/sitemaps-12.html
http://forum.vatan.tc/sitemaps-13.html
http://forum.vatan.tc/sitemaps-14.html
http://forum.vatan.tc/sitemaps-15.html
http://forum.vatan.tc/sitemaps-16.html
http://forum.vatan.tc/sitemaps-17.html
http://forum.vatan.tc/sitemaps-18.html
http://forum.vatan.tc/sitemaps-19.html
http://forum.vatan.tc/sitemaps-20.html


4/10/2008 10:54 AM | forum

# re: Synchronized .NET Collection Classes are NOT Thread Safe.

Thanks for the info!
Excellent article, added to bookmarks.
11/5/2008 3:41 PM | buy viagra
Post a new comment about this topic
Title  
Name  
Url

Comments   
Enter the code you see: