I'm trying to create a sub-table from another table of all the last name fields sorted A-Z which have a phone number field that isn't null. I could do this pretty easy with SQL, but I have no clue how to go about running a SQL query within Excel. I'm tempted to import the data into postgresql and just query it there, but that seems a little excessive.
How to Embed a SQL Query in Microsoft Excel. This article will help users to embed SQL Query in Excel 2010 and create a dynamic connection in Excel. Go to Data tab and select From Other Sources as shown in the screen shot below.
For what I'm trying to do, the SQL query SELECT lastname, firstname, phonenumber WHERE phonenumber IS NOT NULL ORDER BY lastname would do the trick. It seems too simple for it to be something that Excel can't do natively.
How can I run a SQL query like this from within Excel? There are many fine ways to get this done, which others have already suggestioned.
Following along the 'get Excel data via SQL track', here are some pointers. Excel has the 'Data Connection Wizard' which allows you to import or link from another data source or even within the very same Excel file. As part of Microsoft Office (and OS's) are two providers of interest: the old 'Microsoft.Jet.OLEDB', and the latest 'Microsoft.ACE.OLEDB'. Look for them when setting up a connection (such as with the Data Connection Wizard).
Once connected to an Excel workbook, a worksheet or range is the equivalent of a table or view. The table name of a worksheet is the name of the worksheet with a dollar sign ('$') appended to it, and surrounded with square brackets (' and '); of a range, it is simply the name of the range. To specify an unnamed range of cells as your recordsource, append standard Excel row/column notation to the end of the sheet name in the square brackets. The native SQL will (more or less be) the SQL of Microsoft Access.
(In the past, it was called JET SQL; however Access SQL has evolved, and I believe JET is deprecated old tech.). Example, reading a worksheet: SELECT.
FROM Sheet1$. Example, reading a range: SELECT. FROM MyRange. Example, reading an unnamed range of cells: SELECT.
FROM Sheet1$A1:B10. There are many many many books and web sites available to help you work through the particulars. Further notes By default, it is assumed that the first row of your Excel data source contains column headings that can be used as field names.
If this is not the case, you must turn this setting off, or your first row of data 'disappears' to be used as field names. This is done by adding the optional HDR= setting to the Extended Properties of the connection string. The default, which does not need to be specified, is HDR=Yes. If you do not have column headings, you need to specify HDR=No; the provider names your fields F1, F2, etc.
A caution about specifying worksheets: The provider assumes that your table of data begins with the upper-most, left-most, non-blank cell on the specified worksheet. In other words, your table of data can begin in Row 3, Column C without a problem. However, you cannot, for example, type a worksheeet title above and to the left of the data in cell A1. A caution about specifying ranges: When you specify a worksheet as your recordsource, the provider adds new records below existing records in the worksheet as space allows. When you specify a range (named or unnamed), Jet also adds new records below the existing records in the range as space allows.
However, if you requery on the original range, the resulting recordset does not include the newly added records outside the range. Data types (worth trying) for CREATE TABLE: Short, Long, Single, Double, Currency, DateTime, Bit, Byte, GUID, BigBinary, LongBinary, VarBinary, LongText, VarChar, Decimal. Connecting to 'old tech' Excel (files with the xls extention): Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C: MyFolder MyWorkbook.xls;Extended Properties=Excel 8.0. Use the Excel 5.0 source database type for Microsoft Excel 5.0 and 7.0 (95) workbooks and use the Excel 8.0 source database type for Microsoft Excel 8.0 (97), 9.0 (2000) and 10.0 (2002) workbooks. Connecting to 'latest' Excel (files with the xlsx file extension): Provider=Microsoft.ACE.OLEDB.12.0;Data Source=Excel2007file.xlsx;Extended Properties='Excel 12.0 Xml;HDR=YES;' Treating data as text: IMEX setting treats all data as text. Provider=Microsoft.ACE.OLEDB.12.0;Data Source=Excel2007file.xlsx;Extended Properties='Excel 12.0 Xml;HDR=YES;IMEX=1'; (More details at ) More information at, and at Connecting to Excel via ADODB via VBA detailed at Microsoft JET 4 details at.
@ivanpozdeev I have just confirmed for myself using Excel 2010 that one can query the current file; I don't know whether later editions of Excel/Office make this no longer possible. I would agree that creating self-referential tables via the Data Connection Wizard is clunky - largely because the connection is made using the full path to the workbook, so renaming/copying/moving the workbook would lead to breaking it or confusing results. However, for workbooks where use of VBA isn't an issue, self-referential querying is very manageable. – Aug 20 '18 at 17:51. You can do this natively as follows:. Select the table and use Excel to sort it on Last Name. Create a 2-row by 1-column advanced filter criteria, say in E1 and E2, where E1 is empty and E2 contains the formula =C6=' where C6 is the first data cell of the phone number column.
Select the table and use advanced filter, copy to a range, using the criteria range in E1:E2 and specify where you want to copy the output to If you want to do this programmatically I suggest you use the Macro Recorder to record the above steps and look at the code. If you need to do this once just follow Charles' descriptions, but it is also possible to do this with Excel formulas and helper columns in case you want to make the filter dynamic. Lets assume you data is on the sheet DataSheet and starts in row 2 of the following columns:. A: lastname.
B: firstname. C: phonenumber You need two helper columns on this sheet. D2: =if(A2 = ', 1, 0), this is the filter column, corresponding to your where condition. E2: =if(D2 1, ', sumifs(D$2:D$1048576, A$2:A$1048576, '. If you have compiled with the against the Expat library, you can use the to read.xlsx files, and run SQL expressions from a command prompt.
For example, from a shell in the same directory as the spreadsheet, use the utility: ogrinfo -dialect sqlite -sql 'SELECT name, count(.) FROM sheet1 GROUP BY name' Book1.xlsx will run a query on sheet1, and output the query result in an unusual form: INFO: Open of `Book1.xlsx' using driver `XLSX' successful. Layer name: SELECT Geometry: None Feature Count: 36 Layer SRS WKT: (unknown) name: String (0.0) count(.): Integer (0.0) OGRFeature(SELECT):0 name (String) = Red count(.) (Integer) = 849 OGRFeature(SELECT):1 name (String) = Green count(.) (Integer) = 265. Or run the same query using to make a simple file: $ ogr2ogr -f CSV out.csv -dialect sqlite -sql 'SELECT name, count(.) FROM sheet1 GROUP BY name' Book1.xlsx $ cat out.csv name,count(.) Red,849 Green,265. To do similar with older.xls files, you would need the, built against the FreeXL library, which is not really common (e.g. Not from OSGeo4w).