Archive for the ‘Sys. Admin.’ Category

Listing Table Sizes

Friday, December 24th, 2010

Databases are a pain in the neck to look after. Poorly designed models and processes that don’t remove temporary data can cause a database to grow in size. A database that is allowed to grow beyond its requirements becomes a burden on the nightly backup, takes longer to restore in the event of a recovery scenario, and slows down the development process by preventing developers from testing things out on “live” data.

More often than not I have found that the problem lies with log or analytic tables. Sometimes this information is liberally logged (which it should be) and then totally ignored, without a thought for trimming the data on a regular basis.

SQL Server Management Studio provides a way of looking at the storage usage of tables individually from the properties context menu item of the table.

SSMS Storage Properties

In large databases this can be laborious. I found a script that collects this information and presents it as a table, and I have adapted it a little so that I can see the total size of each table and sort by each column to drill down to the problem tables.

SET NOCOUNT ON
CREATE TABLE #spaceused (
  name nvarchar(120),
  ROWS CHAR(11),
  reserved VARCHAR(18),
  DATA VARCHAR(18),
  index_size VARCHAR(18),
  unused VARCHAR(18)
)
 
DECLARE TablesFromSysObjects CURSOR FOR
  SELECT name
  FROM sysobjects WHERE TYPE='U'
  ORDER BY name ASC
 
OPEN TablesFromSysObjects
DECLARE @TABLE VARCHAR(128)
 
FETCH NEXT FROM TablesFromSysObjects INTO @TABLE
 
WHILE @@FETCH_STATUS = 0
BEGIN
  INSERT INTO #spaceused EXEC sp_spaceused @TABLE
  FETCH NEXT FROM TablesFromSysObjects INTO @TABLE
END
 
CLOSE TablesFromSysObjects
DEALLOCATE TablesFromSysObjects 
 
SELECT	name AS TableName,
		ROWS AS ROWS,
		CAST(LEFT(reserved, LEN(reserved) - 3) AS INT) AS Reserved,
		CAST(LEFT(DATA, LEN(DATA) - 3) AS INT) AS DATA,
		CAST(LEFT(index_size, LEN(index_size) - 3) AS INT) AS IndexSize,
		CAST(LEFT(unused, LEN(unused) - 3) AS INT) AS Unused,
		-- reserved already covers data + index_size + unused, so it is left out of the sum to avoid double counting
		(CAST(LEFT(DATA, LEN(DATA) - 3) AS INT) + CAST(LEFT(index_size, LEN(index_size) - 3) AS INT) + CAST(LEFT(unused, LEN(unused) - 3) AS INT)) AS Total
FROM #spaceused
ORDER BY Total DESC
DROP TABLE #spaceused
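
The string handling in the final SELECT is the fiddly part of the script: sp_spaceused hands back sizes as text such as "1024 KB", which has to become a number before it can be sorted. As a rough illustration of the same idea outside SQL Server, here is a small Python sketch. The table names and figures are made up, and parse_kb is a helper name invented for this example.

```python
# Sketch: turn sp_spaceused-style size strings into integers and sort by size.
# The rows below are hypothetical sample values, not output from a real database.

def parse_kb(value):
    """Convert a string such as '1024 KB' into the integer 1024."""
    number, unit = value.split()
    if unit.upper() != "KB":
        raise ValueError(f"unexpected unit in {value!r}")
    return int(number)

# (name, rows, reserved, data, index_size, unused)
sample_rows = [
    ("Customer", "1200", "512 KB", "400 KB", "80 KB", "32 KB"),
    ("AuditLog", "250000", "40960 KB", "36000 KB", "4000 KB", "960 KB"),
]

report = []
for name, rows, reserved, data, index_size, unused in sample_rows:
    report.append({
        "table": name,
        "rows": int(rows),
        "reserved_kb": parse_kb(reserved),
        "data_kb": parse_kb(data),
        "index_kb": parse_kb(index_size),
        "unused_kb": parse_kb(unused),
    })

# Largest tables first, like ORDER BY ... DESC
report.sort(key=lambda entry: entry["reserved_kb"], reverse=True)

for entry in report:
    print(f'{entry["table"]:<12}{entry["rows"]:>10}{entry["reserved_kb"]:>10} KB')
```

Raising an error on an unexpected unit is a deliberate choice here: it is safer than silently chopping off the last three characters if the format ever changes.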

Ordnance Survey OpenData (Part 3 – Cleaning Up)

Friday, December 17th, 2010

If you look through the schema of the table we imported in Part 2 there are a number of unused fields and some of the data appears to be missing.

Cleaning up the Schema

  1. You can go right ahead and remove the fields that start with “Unused”; as far as I can tell only the full version of Code-Point uses these fields.
  2. Remove the nullable attribute from all of the fields. This will prevent us from doing something silly at a later date, and will stop Object Relational Mappers such as Entity Framework from creating nullable data types.
  3. Many of the fields contain codes that describe other data rather than the data itself, so let’s append “Code” to the end of those field names for the time being.

Cleaning up the Data

The Quality column in Code-Point Open describes the source and reliability of the data. It ranges from 10 (the most accurate) through to 90 (no data). When building a system around this data you need to decide what level of quality is important to your use case. The following query will give you an idea of the quality of the dataset as a whole; I have annotated the results based upon the OS Code-Point documentation.

SELECT Quality, COUNT(*) AS COUNT
FROM [OSOpenData].[dbo].[CodePointOpenCombined]
GROUP BY Quality
ORDER BY Quality

Quality  Count    Description
-------  -------  -----------
10       1683975  Within the building of the matched address closest to the postcode mean, determined automatically by Ordnance Survey.
20       73       As above, but determined by visual inspection by GROS (General Register Office for Scotland).
30       1086     Approximate to within 50 m of true position.
40       52       The mean of the positions of addresses previously matched in ADDRESS-POINT but which have subsequently been deleted or recoded.
50       4395     Estimated position based on surrounding postcode coordinates, usually to 100 m resolution, but 10 m in Scotland.
60       93       Postcode sector mean (direct copy from ADDRESS-POINT).
90       6361     No coordinates available.

For my purposes I want to use the coordinate data stored in the Eastings and Northings columns, which makes postcodes with no coordinates useless to me. I can remove these with the following SQL script:

DELETE FROM [CodePointOpenCombined]
WHERE [Quality] = 90

Ordnance Survey OpenData (Part 2 – Importing The Data)

Friday, December 10th, 2010

All of the data is in different files. SSIS is capable of extracting data from multiple files; however, for the purposes of this article I am going to stick to the Import Export Wizard.

To combine all of the files into one (big) file a quick switch to the command prompt is required:

type data\*.csv > .\CodePointOpenCombined.csv

Because none of the data files have headers this works fine; if they did have headers, some work would be needed to strip those out.
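
For files that do carry a header row, a few lines of script can do the stripping while combining. The following Python sketch is one possible way to do it. The function name combine_csv and the UTF-8 encoding are my assumptions, so adjust both to suit your data.

```python
from pathlib import Path

def combine_csv(source_dir, target, skip_header=False):
    """Concatenate every *.csv in source_dir into target, in file-name order.

    Returns the number of lines written. With skip_header=True the first
    line of each source file is dropped.
    """
    source_dir = Path(source_dir)
    target = Path(target)
    written = 0
    with target.open("w", encoding="utf-8", newline="") as out:
        for path in sorted(source_dir.glob("*.csv")):
            if path.resolve() == target.resolve():
                continue  # never read the file we are writing
            with path.open("r", encoding="utf-8", newline="") as src:
                for line_number, line in enumerate(src):
                    if skip_header and line_number == 0:
                        continue
                    if not line.endswith("\n"):
                        line += "\n"  # keep the next file off the same line
                    out.write(line)
                    written += 1
    return written

# Example call (paths are placeholders):
# combine_csv("data", "CodePointOpenCombined.csv")
```

Adding the missing newline at the end of a file matters more than it looks: without it the last row of one file and the first row of the next end up glued together.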

Create a new database in SQL Server then follow these steps:

  1. Right Click the Database select “Tasks” – “Import Data”.
  2. In the Data Source step change the drop down to “Flat File Source”.
  3. Select the combined file we created above (you may have to change the filter).
  4. Check the Columns page; if quotation marks (") appear in some of the columns, change the Text qualifier field on the General page to a ".
  5. On the Advanced page click Suggest Types.
  6. Set the number of rows to 1000 (the maximum), then click OK.
  7. Go through each column and update the name and DataType to match those we discovered in the previous post.
  8. Check the correct database and table are selected on the next two steps.
  9. Click Next then Next again, then check over the data type mappings.
  10. Click Next then ensure Run immediately is checked then click finish.
  11. All being well, all of the data will be imported successfully.

If there are problems importing the data you can go back and make changes to the configuration. Typically the issue is incorrect data types (too small) or incorrect text delimiters.

You may be asking why we went to all that trouble, and time, only to let the Import Data Wizard suggest the data types. The reason I wrote the script is that the wizard is limited to checking the first 1,000 rows; even if you set the value to 2,000,000 it will drop back down to 1,000 after you move your focus away.

The result is that if your data is naturally sorted on a specific column, as some of the Ordnance Survey data appears to be, the import will fail. Running the schema scanner allows you to scan through all of the data so that you can modify the suggested data types to match the maximum values.

Ordnance Survey OpenData (Part 1 – Schema Scanner)

Friday, December 3rd, 2010

In April 2010 the Ordnance Survey released certain parts of their data under a special licence which allows for commercial use without cost. Covering all of the types of data made available is outside the scope of this post, although I hope that the techniques described could be applied to any data set, not just Ordnance Survey data.

In this post I am going to look at Code-Point Open, a list of all UK postcodes with their corresponding spatial positions. Unlike many other OS OpenData downloads the ZIP file does not contain the User Guide or the Schema Data; these can be found on the website, although I spent a good 10 minutes searching for them.

The term for what we are doing in this post is Extract-Transform-Load (ETL), a process in which we take data in one format and convert it for use in another format. Generally ETL is used to take a flat file format and load it for use in a relational database, although technically any format or database could be used. SQL Server offers two built-in mechanisms to perform ETL: the “Import Export Wizard” and SQL Server Integration Services (SSIS). The “Import Export Wizard” actually creates an SSIS package in the background and is available to all versions of SQL Server; SSIS itself is not available in SQL Express.

Before we create a table in a SQL Server database we need to know something about the data we are importing. The documentation for Code-Point Open tells us the data contains the following fields:

Postcode, Quality, Unused1, Unused2, Unused3, Unused4, Unused5, Unused6, Unused7, Unused8, Eastings, Northings, CountryCode, RegionalHealthAuthority, HealthAuthority, AdminCounty, AdminDistrict, AdminWard, Unused10

A number of the fields are not used; these fields and the dummy data held within them will be weeded out at a later date. We know the fields, but we don’t know the format of the data they contain: it could be numeric, strings, decimals or telephone numbers. I created a PowerShell script which scans through all of these files to work out what type each field is and the range of data held within it. Be warned, it will take a few hours to run!

# Schema Scanner v1.0
# ©2010 Richard Slater
 
# Create an empty hash table
$columns = @{}
 
# Loop through every file that matches this pattern (point it at a single file, e.g. ze.csv, for a quick test)
foreach ($file in Get-ChildItem -Path "D:\OSOpenData\Code-Point Open\data\*.csv")
{
	Write-Host "Processing $file"
 
	# PowerShell Import-Csv cmdlet is pretty powerful, but if there is no header row you must feed it in
	$PostCodeData = Import-Csv $file -Header "Postcode","Quality","Unused1","Unused2","Unused3","Unused4","Unused5","Unused6","Unused7","Unused8","Eastings","Northings","CountryCode","RegionalHealthAuthority","HealthAuthority","AdminCounty","AdminDistrict","AdminWard","Unused10"
 
	# Go through each row in the file
	foreach($row in $PostCodeData)
    {
		# Go through each column in the row
		foreach ($attr in (Get-Member -InputObject $PostCodeData[0] -MemberType NoteProperty))
		{
			$key = $attr.Name
 
			# Ignore unused columns
			if ($key.StartsWith("Unused"))
				{ continue }
 
			# Construct an object to store the metadata, store it in the hash table to retrieve next loop
			$column = New-Object PSObject
			if (!$columns.ContainsKey($key))
			{
				$column | Add-Member -Type NoteProperty -Name StringLength -Value 0
				$column | Add-Member -Type NoteProperty -Name MaxValue -Value ([System.Int32]::MinValue)
				$column | Add-Member -Type NoteProperty -Name MinValue -Value ([System.Int32]::MaxValue)
				$columns.Add($key, $column)
			}
			else
				{ $column = $columns.Get_Item($key) }
 
			$isInt = $false
			$value = 0;
 
			# Work out if this is an integer type
			if ([System.Int32]::TryParse($row.($key), [ref] $value))
            	{ $isInt = $true }
 
			if (!$isInt)
            {
				# it is not an integer how many characters is the string
            	if (($row.($key)).Length -gt $column.StringLength)
                	{ $column.StringLength = ($row.($key)).Length }
 
				continue
            }
 
			# it is an integer start working out the maximum and minimum values
			if ( $value -gt $column.MaxValue ) { $column.MaxValue = $value }
			if ( $value -lt $column.MinValue ) { $column.MinValue = $value }
 
			$columns.Set_Item($key, $column)
		}
	}
}
 
# Print a report of all of the fields
foreach ($field in $columns.Keys)
{
	$stringLength = $columns[$field].StringLength
	$numericMax = $columns[$field].MaxValue
	$numericMin = $columns[$field].MinValue
 
	if ($stringLength -gt 0)
	{
		Write-Host "$field (String) : Length =" $columns[$field].StringLength
	}
	elseif (($numericMax -gt ([System.Int32]::MinValue)) -and ($numericMin -lt ([System.Int32]::MaxValue)))
	{
		Write-Host "$field (Numeric) : MinValue =" $numericMin ", MaxValue =" $numericMax
	}
	else
	{
		Write-Host "$field (Empty)"
	}
}

The output from the script should give you enough information to construct a nice tight schema to import the data:

AdminWard (String) : Length = 2
AdminDistrict (String) : Length = 2
AdminCounty (Numeric) : MinValue = 0 , MaxValue = 47
Quality (Numeric) : MinValue = 10 , MaxValue = 90
RegionalHealthAuthority (String) : Length = 3
Postcode (String) : Length = 7
Eastings (Numeric) : MinValue = 0 , MaxValue = 655448
Northings (Numeric) : MinValue = 0 , MaxValue = 1213660
CountryCode (Numeric) : MinValue = 64 , MaxValue = 220
HealthAuthority (String) : Length = 3
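
For anyone who wants to experiment with the same idea without PowerShell, here is a rough Python sketch of the scanner. It is not a port of the script above, just the same technique: try to parse each value as an integer, track the minimum and maximum if that works, otherwise track the longest string. The names scan_schema and describe are invented for the example, the sample rows are made up, and Python's int() is slightly more lenient than Int32.TryParse, so treat the results as a guide.

```python
import csv
import io

def scan_schema(rows, header, skip_prefix="Unused"):
    """Return {column: {"max_len", "min", "max"}} for an iterable of CSV rows."""
    stats = {
        name: {"max_len": 0, "min": None, "max": None}
        for name in header if not name.startswith(skip_prefix)
    }
    for row in rows:
        for name, value in zip(header, row):
            column = stats.get(name)
            if column is None:
                continue  # an unused column
            try:
                number = int(value)
            except ValueError:
                # Not an integer: remember the longest string seen so far
                column["max_len"] = max(column["max_len"], len(value))
                continue
            column["min"] = number if column["min"] is None else min(column["min"], number)
            column["max"] = number if column["max"] is None else max(column["max"], number)
    return stats

def describe(stats):
    """Format the statistics in the same style as the report above."""
    lines = []
    for name, column in stats.items():
        if column["max_len"] > 0:
            lines.append(f"{name} (String) : Length = {column['max_len']}")
        elif column["min"] is not None:
            lines.append(f"{name} (Numeric) : MinValue = {column['min']} , MaxValue = {column['max']}")
        else:
            lines.append(f"{name} (Empty)")
    return lines

# Tiny made-up sample with a cut-down header
header = ["Postcode", "Quality", "Unused1", "Eastings"]
sample = io.StringIO("XX11AA,10,,394251\nXX11AB,90,,0\n")
report = describe(scan_schema(csv.reader(sample), header))
print("\n".join(report))
```

Because csv.reader streams one row at a time, this never holds a whole file in memory, which is one of the things I would expect to help with run time on the full data set.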

In a future post I am going to take it to the next stage; create a table and complete the import with the Import Export Wizard. I would also like to improve the performance of the schema scanner by converting the code into C#.

SchemaScanner

Elevated Command Prompt

Thursday, October 7th, 2010

I explained how to get an elevated Command Prompt to perform system tasks in the comments of my post about setting the MTU in Windows 7. I am writing it up here a bit more clearly and linking to it from that post.

In Vista and Windows 7 applications don’t automatically get administrator privileges; they either need to request them or the user needs to explicitly start the application as an Administrator. The way to do this with the Command Prompt is as follows:

  1. Press the “Start” button.
  2. Type “Command”.
  3. “Command Prompt” will be shown in the search results.
  4. Right Click “Command Prompt” and select “Run as Administrator” (it will have a blue and yellow shield beside it).
  5. When prompted click “Yes” to allow Command Prompt to start as Administrator.
  6. You will know it has worked because the title bar will start with “Administrator:”

ASP.net 3.5 GridView RowCommand event fired twice

Thursday, April 1st, 2010

I am writing this up to hopefully save someone else time in the future. This particular problem took up six hours of my day yesterday, causing quite a bit of frustration for me, as the developer, and for the users of the application.

If you are searching for the solution scroll down to the bottom of the page where I will outline the solution I used to resolve the problem. It is also worth pointing out that this does appear to be fixed in .NET 4. Certainly I was able to consistently reproduce the problem with VS2008/.NET 3.5 on multiple different computers; however, after converting the project to VS2010/.NET 4 I haven’t seen the issue.

Explanation of the problem

I wrote and maintain an application that publishes a list of courses and allows users to book onto these courses, what I have listed below is a simplified version of this application.

The administration console contains two lists:

  • Published Courses – courses visible to all employees.
  • Unpublished Courses - courses waiting to be published, only visible from the administration console.

Courses can be freely published (i.e. moved from Unpublished to Published) by clicking the green tick. Courses that have not had any bookings made can be unpublished by clicking the red cross.

The cross and the tick are implemented as GridView ButtonFields:


<asp:ButtonField ButtonType="Image" CommandName="UnpublishCourse"
    ImageUrl="~/images/unpublish.png" InsertVisible="False" Text="Unpublish" />

This application has been running for six months, and the issue had not been observed until yesterday. The user explained to me that when they were publishing courses they were always published in pairs; equally, when unpublishing courses it was being done in pairs, potentially unpublishing a course with bookings.

Investigating the problem

Initially I tried to reproduce this on my local machine: I backed up the database and restored it locally, made sure I was running the same revision as the server and fired it up. I couldn’t reproduce the problem; no matter how fast I clicked, it wouldn’t happen. I tried various permutations of code and database but could only reproduce it on the server.

Refreshed the binaries on the server with the HEAD from subversion, problem was still happening most of the time. I confirmed that it wasn’t an issue with the stored procedures by running them manually through LinqPad.

I started putting debug statements at the entry points to the critical parts of the code. This yielded an interesting output on my development machine: each time the cross or the tick was clicked, UnpublishedGridView_RowCommand was fired twice. This gave me something to search for. It seems I am not the only one to have this problem; Microsoft tried to reproduce it in 2006 but couldn’t.

Solving the problem

As it turns out there are several ways of fixing the problem. Several people have used timers to “debounce” the RowCommand event; alternatively, assuming that the event is always going to be fired twice, a session variable can be used to filter out the second event.

Because the event is only fired twice when ButtonType="Image", not when ButtonType="Link", you can set the Text property to the HTML that renders your image. This resulted in the code above becoming:


<asp:ButtonField ButtonType="Link" CommandName="UnpublishCourse"
    InsertVisible="False" Text="<img src=images/unpublish.png />" />

This proved to be the simplest possible solution. Visual Studio 2008 throws a warning about ASP.net validation, but I can live with that as long as the application works. In addition to its simplicity, the solution also continues to work in ASP.net 4 (which doesn’t exhibit the double event behaviour).

OneNote vs Evernote

Saturday, February 27th, 2010

Somewhere in the middle of 2007 I was encouraged to use OneNote to clear my desk and move to a “paperless” system, initially this was a little painful as it seemed a gargantuan task to scan in all of the bits of paper on and around my desk that appeared to contain useful information.

As it turned out I realised that if a bit of paper was covered by another (or in fact covered by anything) it wasn’t that important to the execution of my role and could probably be thrown in the bin.

At the time I was not using Microsoft Office at home, opting to use OpenOffice for the limited needs I had for productivity software. I did however want a better way of organising my paperwork at home, OneNote 2007 came in at about £70 which isn’t unreasonable for what you got. Then I discovered Evernote.

It seemed perfect: I don’t generate so much paperwork that I would bust the 40 MB/month limit on the free account. In the end I decided to adopt Evernote at home and continue to use OneNote at work, which proved quite a handy separation of work and life.

Recently I have run into two problems that are pushing me towards using Evernote for everything, and ditching OneNote entirely:

  1. Evernote handles PDFs really well: you drag them in and they are displayed using the Foxit rendering engine. It just works. OneNote, on the other hand, plain old embeds them into the note; great, but how is that different from having them in a folder in My Documents?
  2. Evernote 3.5 has vastly improved the synchronization mechanism, meaning that I can safely put something into Evernote on my PC and it will be on my laptop shortly after it is next turned on. Microsoft has tried to get this kind of functionality into OneNote and SharePoint; however, it just doesn’t work that well: it is too slow and there seems to be a 10 minute refresh cycle hard coded into the product.

I am still not sure that I want to ditch OneNote entirely, the 2010 version has some nice labour saving devices built in such as quick screen clippings and image formatting with the fluid user interface. Nothing in OneNote 2010 screams “don’t leave me” though.

Login failed for user ''

Monday, January 25th, 2010

There is an excellent post on the SQL Protocols blog about diagnosing the “Login failed for user ''. The user is not associated with a trusted SQL Server connection.” message displayed by SQL Management Studio and other applications which use the same API. Notice the blank username ''.

I believe there is one possibility missing from the above post: the Group Policy setting “Deny access to this computer from the network”, which can be found in both Domain Group Policy and Local Security Policy under the following path:

Computer Configuration » Windows Settings » Security Settings » Local Policies » User Rights Assignment.

I have been using this policy more and more to lock down access to site systems in accordance with our security and access policy. It pays to be cautious when applying User Rights Assignment policies to a machine, as in Windows 2003/XP they are not very granular.

Card Reader on Acer Aspire 5100 Series Under Windows 7

Monday, October 26th, 2009

Update (15/11/2010): the drivers listed in this post are out of date and may cause a BSOD, several alternatives are listed in the comments; however Microsoft appear to have approved 64-bit drivers on Windows Update.

I am typing this on my Acer Aspire 5102WLMi which is one of the popular (if flawed) Acer Aspire 5100 series; I rescued this one from the Balconi Test by putting a bit of rubber (it was a cut down rubber foot) on top of the South Bridge chip set, that however is not the story I am telling today.

I never bothered to install the Card Reader driver on this laptop while I was running the Windows 7 Beta, mainly because I am lazy, but also because I didn’t have a need for it so it never came up. With the release of Windows 7 I wanted to get the system perfect, seeing as hopefully it will last a good year in its present state, and I wanted to be able to re-arrange the SD card from my Acer PDA.

Windows 7 x64 was unable to identify a driver for this particular card reader, this left me with three unknown devices in Device Manager:

Missing Drivers Acer 5100

The Acer website was a bust; as far as Acer are concerned this laptop won’t even run Vista x64, so I had to dig deeper. From past experience of looking for drivers without using Windows Update I knew that I could probably identify the manufacturer from the Hardware and Device IDs available through Device Manager. If you want to follow along here are the steps:

  1. Open up Device Manager (Right Click “Computer”, Choose “Manage”, Select “Device Manager”)
  2. Identify your unknown devices (They will look similar to the image above, although the text will differ)
  3. Right click one of them and select “Properties”
  4. Switch to the “Details” tab
  5. Change the property drop down box to read “Hardware Ids”

What that will give you is one or more strings looking something like this

PCI\VEN_1524&DEV_0530&SUBSYS_009F1025&REV_01

The two important parts are the four digits after “VEN_”, which give the PCI vendor number, and the four digits after “DEV_”, which give the device number (1524 and 0530 in the example above). Together these two numbers should uniquely identify the device, and so the driver you need.
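
If you have a handful of unknown devices it can be quicker to pull the two numbers out with a script than to read them off by eye. Here is a small Python sketch that does this for PCI hardware IDs of the form shown above. The function name parse_hardware_id is made up for the example, and the pattern only covers PCI\VEN_xxxx&DEV_xxxx strings, so other bus types (USB, for instance) would need their own pattern.

```python
import re

# Matches the vendor and device parts of a PCI hardware ID string
HARDWARE_ID = re.compile(
    r"^PCI\\VEN_(?P<vendor>[0-9A-F]{4})&DEV_(?P<device>[0-9A-F]{4})",
    re.IGNORECASE,
)

def parse_hardware_id(hardware_id):
    """Return (vendor, device) as upper-case hex strings, or None if no match."""
    match = HARDWARE_ID.match(hardware_id)
    if match is None:
        return None
    return match.group("vendor").upper(), match.group("device").upper()

print(parse_hardware_id(r"PCI\VEN_1524&DEV_0530&SUBSYS_009F1025&REV_01"))
# prints ('1524', '0530')
```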

There are several sites that allow you to look up these numbers. I tend to use the publicly available PCI Vendor and Device Lists at PCIDatabase.com, which have always given me good results with minimum fuss and adverts.

Armed with the above I identified the manufacturer of the Card Reader as ENE Technologies. Sometimes this is all you need to find the driver: you can Google/Bing the name, click the download or support links and get the latest drivers. This isn’t always the way, as some OEMs don’t offer drivers, leaving it to the system integrator to offer that service.

After some time with Bing I found drivers for various ENE devices; the drivers available from VersionTracker seemed the most promising. After downloading the file and unzipping its contents to a folder on my Desktop, I was able to point Device Manager at these files for each of the unknown devices. I was left with three working devices and a fully operational Card Reader.

ENECardReaderDriversAcer5100

Hope this helps some other people with similar laptops or Card Readers, post in the comments with your experiences, please include the manufacturer and model of the laptop/netbook you have succeeded with and hopefully you will help someone else with the same devices.

Change your MTU under Vista or Windows 7

Friday, October 23rd, 2009

This information is available in many many other places, however I am putting it on here because I know it will be here for me to refer to. Also it is handy, as I know I can access my web-site even if the MTU is misconfigured.

For some reason that has escaped me, Path MTU Discovery (PMTUD) in Windows just doesn’t seem to figure out the MTU for a given path (something to do with routers being poorly configured to not respond to ICMP requests), so Windows uses the default. For the most part this doesn’t affect anyone; however, if it does affect you, it really annoys you. Failure of PMTUD will result in some websites not loading correctly, trouble connecting to normally reliable online services and general Internet weirdness.

The resolution is to set your default MTU to a value lower than the Ethernet default of 1500. Here is how:

Step 1: Find your MTU
From an elevated CMD Shell enter the following command:

netsh interface ipv4 show subinterfaces

You should get something like this

MTU         MediaSenseState  Bytes In    Bytes Out  Interface
----------  ---------------  ---------   ---------  -------------
4294967295  1                0           13487914   Loopback Pseudo-Interface 1
1500        1                3734493902  282497358  Local Area Connection

If you are using an Ethernet cable you will be looking for “Local Area Connection” or “Local Area Connection 2” (if you happened to plug into the second network port). If you are using Wireless you will be looking for “Wireless Network Connection”. The MTU is in the first column.

Step 2: Find out what it should be

In the CMD shell type:

ping www.cantreachthissite.com -f -l 1472

The host name should be a site you cannot reach. The -f flag marks the packet as one that should not be fragmented, and -l 1472 sets the size of the packet (1472 = 1500 - 28, i.e. the Ethernet default MTU of 1500 bytes less the 28-byte packet header).

If the packet can’t be sent because it would need to be fragmented you will get something similar to this:

Packet needs to be fragmented but DF set.

Keep lowering the packet size by 10 (i.e. -l 1460, 1450, 1440, etc.) until you get a successful ping. Then raise the packet size by one at a time until you get “Packet needs to be fragmented but DF set.” again. The last successful value plus 28 will be your MTU value.

In my case a packet size of 1430 succeeds but 1431 fails, so 1430 + 28 = 1458.
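
The search itself is simple enough to write down as code. The Python sketch below follows the manual procedure (step down until a ping succeeds, then creep back up one byte at a time) but leaves the actual ping to a probe function that you would have to supply; find_mtu and probe are names invented for this example, and the probe used at the end is a stand-in that simply pretends anything up to 1430 bytes gets through.

```python
HEADER_BYTES = 28  # 20-byte IPv4 header + 8-byte ICMP header

def find_mtu(probe, start=1472, step=10):
    """Work out the MTU from the largest payload that pings without fragmenting.

    probe(size) must return True when a ping with that payload size and the
    don't-fragment flag set succeeds.
    """
    size = start
    # Step down in coarse jumps until a ping gets through
    while size > 0 and not probe(size):
        size -= step
    if size <= 0:
        raise RuntimeError("no payload size succeeded")
    # Creep back up one byte at a time until the next size would fail
    while size < start and probe(size + 1):
        size += 1
    return size + HEADER_BYTES

# Stand-in probe: pretend payloads of up to 1430 bytes succeed
print(find_mtu(lambda size: size <= 1430))  # prints 1458
```

A binary search would need fewer pings, but this version mirrors the steps you would type by hand, which makes it easier to check against your own results.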

Step 3: Set your MTU

Now that you have identified the interface you need to change and the ideal MTU, it is time to make the change. Again from an elevated CMD Shell, type the following, replacing my MTU of 1458 with your own value:

netsh interface ipv4 set subinterface "Local Area Connection" mtu=1458 store=persistent

Or if you are using a Wireless connection:

netsh interface ipv4 set subinterface "Wireless Network Connection" mtu=1458 store=persistent

If all has gone well you should have a perfectly working internet connection.