EsoErik

Saturday, June 30, 2012

 

Efficient Disjoint Subset Detection for [2,n] Cardinality Sets

Note: it turns out to be faster to do subset search implicitly with chain search using Fork-Fulkerson max flow.  (And is even faster using Hopcroft-Karp bipartite maximum matching.)

My Sudoku solver's disjoint subset logical reduction routine logical reduction routine detects when any n cells in a row, column, or block share the same n possible values.  Those cells are the only cells in that row, column, or block that may contain those values, allowing the corresponding possible values for the other cells to be reduced.  Menneske.no has a good explanation of this operation.

 I represent the possible values with a boost::dynamic_bitset.  For a cell in a 9-wide sudoku with possible values 1, 3, and 9, the corresponding bitset would be the binary number b101000001.  Imagine that another cell in the same row has possible values 2, 3, 6, and 9, and the bitset b011001001.  Bitwise AND of their two bitsets would yield b001000001 - cardinality 2 for AND of 2 cells, indicating that these cells form a disjoint pair and that no other cells in the row may contain 3 or 9.

How about identifying disjoint sets of 3, 4, 5, or perhaps even 65535 cells (for a very large Sudoku puzzle)?  A clustering algorithm using a modified Hamming distance operator to compare similarity rather than discrepancy might help, but the algorithms I investigated were not trivial to adapt for providing the exact solution required.

At first blush, bitwise ANDing all combinations of cells for all n values seemed impractical.  Consider that 256 choose 2 (to identify all pairs in a 256-wide Sudoku) yields 32640, and 256 choose 10 yields 278,826,214,642,518,400...  Some method to avoid searching the incredibly vast number of useless cell sets is required.  So, I went outside with a notepad and a beer and set about drawing Sierpinski triangles and considering methods of stacking results to avoid redundant calculation and useless sets.  Some sketching and fiddling later, I found the fractal pattern, illustrated below, where A, B, ..., G are cell bitsets.  AND the color on the left with the column of the same color on the right to produce the column of that color below in the next iteration.

This works quite well and solves any 3x3 (9 wide) Sudoku in less than a microsecond on my workstation.  16x16 (256 wide) puzzles still take minutes, so I'm thinking about various ways store more book keeping information and avoid the per-row, column, and block AND operations entirely.  It will certainly be necessary (along with more RAM!) in order to solve 256x256 (65536-wide) puzzles.

Archives

July 2009   August 2009   September 2009   October 2009   November 2009   December 2009   January 2010   September 2010   December 2010   January 2011   February 2011   April 2011   June 2011   August 2011   February 2012   June 2012   July 2012   August 2012   October 2012   November 2012   January 2014   April 2014   June 2014   August 2014   September 2014   October 2014   January 2015   March 2015   April 2015   June 2015   November 2015   December 2015   January 2016   June 2016   August 2016   January 2017   March 2017   April 2018   April 2019   June 2019   January 2020  

This page is powered by Blogger. Isn't yours?

Subscribe to Posts [Atom]