SQL work-vacation, T-2 days (#Workation #PASSSummit)

Well here we are! Sean and I have traveled via our favorite domestic airline – Southwest – from Dallas to Seattle, for our annual vacation / work week / SQL reunion: PASS Summit!

summit-is-coming

I’m going to try writing a bit about each day I’m here, just because it might be useful. Or, in a year or two, nostalgic. Or just to have something like letters from camp;

Hi there,  internet friends, today at camp it rained some more, and I talked to 30 new friends, and I got a squishy stress ball from the vendor in the next booth…

One Decade of Dallas to Seattle

This will be my tenth year at PASS Summit! Ten years ago, I won a little contest that got little, lower-midlevel Jen to the Summit, and introduced me to a whole heckofa lot of SQL people online. That was fun

Since then I’ve had a lot of different experiences with Summit. We spent several years interviewing really cool, intelligent, and/or famous ( 🙂 ) SQL people. We have spoken at Summit a few times, been on panels,  spoken to vendors and reps, played games, and even did a precon one year. We’ve hosted parties and meetups, raised money for charity, ran short runs and long runs, and we (okay, I) have sun an AWFUL lot of karaoke, fueled by pots of green tea and/or beer.

Not technically Summit (this was in Chicago), but it still counts.
Not technically Summit (this was in Chicago), but it still counts.

It’s a good time.

“But I’m not at Summit,” you say.

And I always feel for the folks who don’t get to make it. After all, it’s a week out of your life, AND it’s a big expense that your company won’t always spring for, and so on.

So, while I will be writing at least one blog a day about our experiences here in Seattle this week, I will also do my best to bring you, you fine people at home, various goodies and resources and freebies, straight to your computer monitors, from yours truly. you deserve it.

What’s coming up?

We always try to come to Seattle early so that we can enjoy the city, and meet up with a friend or two, before the real rush starts. Tonight we’re all settled into a lovely apartment near the waterfront, and we’ll go get some supplies for dinner / breakfast / tea / coffee.

Tomorrow – Sunday – who knows? Maybe a rainy bike ride around the city, most likely some time with friends.  And I absolutely HAVE to finish up the last bits of the SQL Yearbook, a project I’ve assigned myself after years of hoping someone else would come up with the idea independently. 🙂 No such luck. It’s been a fun project, though.

A small preview of one section of SQL Yearbook 2018.
A small preview of one section of SQL Yearbook 2018.

Monday is when things start up for real. Sean and I are teaching a session each at the Seattle Freecon, which I’m looking forward to! And in the evening, we ourselves are hosting our annual Donut Meetup, sponsored by MinionWare (also us).

We can talk about Tuesday onward another day, but you’ll get a general idea if you glance over my “All about PASS Summit 2018” post.

Okay, TTFN. Time to go pick up supplies!

Happy days,
Jen

Regular Expressions: tagging expressions

I re-discovered something very cool while using RegEx in SSMS today: tagging expressions and referencing them in the same line!

I’ll take a minute to define regular expressions, explain tagged expressions, and then show you how to use it in the same line.

But first, XKCD:

An XKCD comic strip where the narrator imagines heroically swooping in to perform RegEx, in order to save a life.

Regular Expressions defined

“A regular expressionregex or regexp [is]… a sequence of characters that define a search pattern. Usually this pattern is then used by string searching algorithms for “find” or “find and replace” operations on strings, or for input validation.” – Wikipedia

In SQL Server Management Studio, we use RegEx to find, or find-and-replace, patterns of text in the code. For example, to find a line that ONLY has the characters “GO” on it, we might search for:

^GO$

The carat (^) means beginning of line, the $ means end of line, and GO is a string literal.

What are tagged expressions?

You can tag a group of expressions and later treat them as one group. So for example, if we wanted to prefix all CREATE TABLE statements in a script with a comment marker (———— tablename ———— ) on the previous line, we could do something like this:

FIND: CREATE TABLE {.+}\(
REPLACE WITH:  ------------ \1 ------------ \nCREATE TABLE \1 (

This changes the code CREATE TABLE dbo.TableB ( … to:

------------ dbo.TableB ------------
CREATE TABLE dbo.TableB ( ...

The brackets {} in the FIND expression say, “treat anything in this grouping as a single entity”. (The .+ says “one or more characters”, and the backslash in front of the parenthesis marks it as a literal character, instead of a RegEx expression).

The \1 in the REPLACE WITH expression refers back to that tagged expression ({.+}). It’s the first (and as it turns out, only) expression we tagged, so we refer to it with \1. (You can think of \1 as an automatic variable.)

If we had tagged two expressions, we’d refer to the second expression with \2. (Note that \n in the REPLACE WITH expression says to insert a line break.)

What’s this about using the tagged expression in the same line?

I had an ordered list of server names that I needed to deduplicate. Of course I could insert them to a table and dedupe using T-SQL, but it’s never a bad idea to learn an alternate method.

I paste the list into SSMS:

SQL1
SQL1
SQL1
SQL3
SQL3
SQL3
...

I then create a Regular Expression to find two consecutive lines that contain the exact same characters (server names), and replace that with a single line containing those characters (server name):

FIND: ^{.+}\n\1$
REPLACE WITH: \1

Let’s break down the FIND statement:

  • ^ means beginning of line
  • {} tags the expression contained within
  • .+ means one or more characters
  • \n means line break
  • and \1 refers to that tagged expression we LITERALLY JUST DEFINED

So in English this would read “find a group of characters that take up an entire line, followed by the exact same group of characters that take up another entire line”. Or to be more succinct (and less accurate), “Find a server name on one line, followed by that same server name”.

THAT’S REALLY COOL. I’m used to referencing a tagged expression in the REPLACE WITH field, but it’s pretty neat to tag an expression and then use it right away.

When I run this find-and-replace a handful of times, the list is deduplicated:

SQL1
SQL3
...

Perhaps I am easily pleased, but that’s not a bad thing. I heartily approve of this RegEx feature.

Bonus: A discussion on limitations

First things first: this particular example only works on an ordered list. If the list of server names were not ordered, the regular expression would not find all of the duplicates, and I’d be better off doing my “dedupe using a temp table and TSQL” method.

Second, all of the duplicate entries must be identical. That’s not a big deal in this example, but let’s say we had a list of duplicated table names? There’s more than one accurate way to represent a table name. For example:

  • dbo.Tbl1
  • dbo.[Tbl1]
  •  dbo.[Tbl1]
  • [dbo].Tbl1
  • [dbo].[Tbl1]
  • Tbl1
  • [Tbl1]

Those are all absolutely duplicates, but RegEx does NOT know that! Because ‘dbo.Tbl1’ <> ‘Tbl1’, of course. And if there are tabs or spaces before or after the table name (or server name), we have the same problem! ‘Svr1 ‘ <> ‘Svr1’. The problem is even worse if you’re on a case-sensitive database. And, this particular problem extends to our (implied) TSQL solution, too!

Third, in the last section I said, “to be more succinct (and less accurate), ‘Find a server name on one line, followed by that same server name'”. It’s less accurate to say it this way, because RegEx doesn’t know we’re talking about server names. This particular find-and-replace code just wants to grab a pair of identical lines of text, and turn them into a single line.

So yes, there are absolutely limits on this…but that’s true of any solution, to any problem. We use the tools we have on hand to get the job done, and RegEx is one beautiful jackhammer of a text manipulation tool!

sqlhelp: limit the x-axis of a geometry line graph

I have a #sqlhelp question that requires some code. So hey, blog post time!

TL;DR: the geometry::STGeomFromText graph provides too wide a range for the X-axis, for a large variance of Y-axis values. Is there a way to limit the X-axis range?

Update: We have an answer! Thanks MJ!

I’m playing with a geometry query based on this article. The query is meant to take inputs from a table, and generate a line graph of amounts over time. Something like this:

overtime

But when I run the following code:

create table #tmp (Yr varchar(4), nm varchar(4))
insert into #tmp (Yr, nm)
values (2010, 10),
(2010, 500),
(2011, 11),
(2012, 200),
(2013, 17),
(2014, 10),
(2015, 20)

DECLARE @WK VARCHAR(8000)
SET @WK =
STUFF(
( SELECT ',' + Yr + ' ' + Nm
FROM #tmp
ORDER BY Yr
FOR XML PATH('')), 1, 1, '');
SELECT geometry::STGeomFromText( 'LINESTRING(' + @WK + ')', 0 );

…I get a graph that is basically a vertical line:

graph

It’s not truly a vertical line. When I zoom in, I can see that this graph actually represents all the numbers…it’s just that there is a VERY LARGE difference between 10 and 500, the minimum and maximum values. And the graph, for some reason, decides that we also need a REALLY WIDE x-axis range (from before 1960 to after 2110). The graph is unreadable.

If I reduce the larger values down, I do get a usable graph…but of course, it doesn’t represent reality:

graph2

So, here’s the “Too Long;Didn’t Read”: The geometry::STGeomFromText graph provides too wide a range for the X-axis, for a large variance of Y-axis values. Is there a way to limit the X-axis range?

Thanks for playing along!