Get around Active Directory Paging on SSIS import

I have a client who is importing certain users from Active Directory.  The paging on their AD is set to 20,000 records.  When trying to pull data using a SQL statement, the query fails because it hits the maximum number of records and is unable to return more.   You could work around a problem like this by editing your query filter to ensure that you always retrieve fewer than 20,000 records at a time, for example using the whenCreated field.  However, there is no guarantee that whatever filter you use will always limit your return value to a maximum of 20,000 records.  And you now need to build a loop construct to retrieve all the records since you want more than 20,000 records.

This is much easier to solve than you might think, judging from the number of forum questions out there on the subject (and how long it took me to piece it together).   Here are the steps.

Create an SSIS package.

Add a string variable, scoped to the package, called SqlConn.  Populate it with the connection string to the database you want to populate with the AD records.

Add a script task to your package.  Open the script task, making sure that the ScriptLanguage is C# and not VB.

image

Click on the Edit Script button.  On the right hand side you should see the Project Explorer window.  Right click on the name of the Project File at the top of the tree and select Add Reference.

SNAGHTML6315ef17

On the .NET tab scroll down and find System.DirectoryServices. Select it and click OK.

image

Make sure you see the reference appear in the References folder in the Project Explorer window.

image

Add these statements at the beginning of your script.

using System.DirectoryServices;

using System.Data.SqlClient;

Paste this script to replace the public void Main().  Edit the ds.Filter and Insert string values to meet your table requirements.  Be sure to only select single value attributes of the object.   If you try to use this method to import multi-value attributes such as “Description” from AD it won’t work.  I’ll be writing about that next.

public void Main()

{

//Set up the AD connection;

using (DirectorySearcher ds = new DirectorySearcher())

{

//Edit the filter for your purposes;

ds.Filter = “(&(objectClass=user)(|(sAMAccountName=A*)(sAMAccountName=D0*)))”;

ds.SearchScope = SearchScope.Subtree;

ds.PageSize = 1000;

//This will page through the records 1000 at a time;

//Set up SQL Connection

string sSqlConn = Dts.Variables[“SqlConn”].Value.ToString();

SqlConnection sqlConnection1 = new SqlConnection(sSqlConn);

SqlCommand cmd = new SqlCommand();

SqlDataReader reader;

cmd.CommandType = CommandType.Text;

cmd.Connection = sqlConnection1;

//Read all records in AD that meet the search criteria into a Collection

using (SearchResultCollection src = ds.FindAll())

{

//For each record object in the Collection, insert a record into the SQL table

foreach (SearchResult results in src)

{

string sAMAccountName = results.Properties[“sAMAccountName”][0].ToString();

string objectCategory = results.Properties[“objectCategory”][0].ToString();

//Replace any single quotes in the string with two single quotes for sql INSERT statement

objectCategory = objectCategory.Replace(“‘”, “””);

sqlConnection1.Open();

cmd.CommandText = “INSERT INTO Users (sAMAccountName, objectCategory) VALUES (‘” + sAMAccountName + “‘,'” + objectCategory + “‘)”;

reader = cmd.ExecuteReader();

sqlConnection1.Close();

} } } }

 

That’s it.  This will iterate through all of the objects in Active Directory, regardless of paging size set on Active Directory.

To learn how to import multi-value fields from AD, read this post:

How to Query Multi-Value Fields from Active Directory using SSIS

27 thoughts on “Get around Active Directory Paging on SSIS import”

  1. Hi Martina,
    Thank you very much for posting this excellent method for retrieving data from AD and by passing the 1000 records limit.

    I added a few lines of code to retrieve “displayName” also for FullName information and getting the error below:

    Error: Index was out of range. Must be non-negative and less than the size of the collection.
    Parameter name: index

    Could you please advise on how to fix this error message?

    Thank you very much for your time.

    Kevin

    kmikailov@gmail.com

    1. This is what I did to fic the issue.

      if (results.Properties.Contains(“displayName”))

      displayName = results.Properties[“displayName”][0].ToString();
      else
      displayName = “No Full Name”;

      Thanks,
      Kevin

      1. Hi Kevin,

        Thanks for your comments. This will also work.

        if (results.Properties[“EmployeeID”].Count > 0)
        {
        EmployeeID = results.Properties[“EmployeeID”][0].ToString();
        }

        Cheers,
        Martina

        1. Thanks again Martina,
          If I wanted to retrieve all the email addrsses for the users with muitiple email addresses what would be the best approach?
          I am ok to add a new row for each new email address or populate a separate table
          with the email and fullname or username..

          I tried the multi-value attributes method above but it populated table with the same
          email addresses without bringing the additional emails.

          Thanks for your time,
          Kevin

        2. Hi Martina,
          More details on my previous question:
          We have users who have more than one email due to the name changes, etc and they are all actively used. If I check in the outlook property, all the email addresses are listed for that users:
          SMTP: email addrss1, SMTP:email addres2, etc…

          Current process is picking up only the first email address..
          May be we need to specify a proxy email addresses in the code to get all of them
          and load into separate table along with the account name info

          Thanks again,
          Kevin

  2. This is greate stuff. However, I keep getting an error that says “The value for the property PageSize cannot be set”. Here is the snippet of code on how i am setting it. The only difference is that i need to connect to the LDAP server with a username and password. I have tried every type of AuthenticationType with no luck. Any idea? I am expecting over 20K results.

    DirectoryEntry ldap = new DirectoryEntry(“LDAP://uat-corpldap:389/dc=statestreet,dc=com”, “uid=v545830,ou=internal,ou=Users,dc=statestreet,dc=com”, “*****”, AuthenticationTypes.Anonymous);
    using (DirectorySearcher ds = new DirectorySearcher(ldap))
    {
    //Edit the filter for your purposes;
    ds.Filter = “(&(objectclass=person)(sn=*))”;
    ds.SearchScope = SearchScope.Subtree;
    ds.SizeLimit = 0;
    ds.PageSize = 500;
    ds.PropertiesToLoad.Add(“cn”);
    ds.PropertiesToLoad.Add(“c”);

    Please help.

    1. Hi Kamran,
      Do you really need to set the PageSize? Does your code work without setting the page size? I specifically wrote the code to work around Paging, so you shouldn’t need to change it.
      Cheers,
      Martina

  3. Cheers for this, wouldn’t it be better to use the Connection Manager and create a ADO.Net Connection rather then a variable?

    For example

    SqlConnection sqlConnection1 = (SqlConnection)(Dts.Connections[“Destination”].AcquireConnection(Dts.Transaction) as SqlConnection);

  4. Hi Martina,
    sorry for disturbing you again.
    Have you had a chance to read my question about how to retrieve all the email addrsses for the users with muitiple email addresses?

    Thanks,
    Kevin

    1. The script task is something I’m used to reaching for when I need custom work. It has more flexibility than a script component. In this situation a script component might have been enough. Cheers, Martina

  5. Thank you for your VERY useful code…
    I am however getting the following message when executing the Script Task. Unfortunately this means very little to me…. any help/guidance would be appreciated

    DTS Script Tas has encountered an exception in user code:
    …..
    Exception has been thrown by the target of an invocation.


    at System.RuntimeMethodHandle._InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
    at System.RuntimeMethodHandle.InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, Signature sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
    at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks)
    at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
    at System.RuntimeType.InvokeMember(String name, BindingFlags bindingFlags, Binder binder, Object target, Object[] providedArgs, ParameterModifier[] modifiers, CultureInfo culture, String[] namedParams)
    at Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTATaskScriptingEngine.ExecuteScript()

    1. Hi Meike, Sorry that is a pretty poor error message. Can help you with that. Did you get anywhere with it? Cheers, Martina

    2. Great solution Martina, thanks for sharing.

      Unfortunately, I get the exact same error as Meike. Any assistance would be greatly appreciated.

  6. So I eventually realised the error was connected to the variable, and entirely human error – mine!. I had forgotten to include the SqlConn variable in the Script Task Editor under the read only variables section.

    http://www.vsteamsystemcentral.com/images/ext/SSIS08ScriptTaskEditor.jpg

    Additionally I learned that the SQL data source string cannot include the provider argument. it shoulld look something like this; Data Source=SQLSERVER;Initial Catalog=SQL_DATABASE;Integrated Security=True

  7. Nevermind! I got it figured out, the syntax is VERY precise but I got it working! Thank you for this great knowledge!

  8. Figured it out! I needed to add the ReadOnlyVariable to the Script Task, adoy! Thank you for this great information!

  9. I know this is a really old thread but I was attempting to do the same thing and add displayname.

    //For each record object in the Collection, insert a record into the SQL table

    foreach (SearchResult results in src)

    {

    string sAMAccountName = results.Properties[“sAMAccountName”][0].ToString();

    string employeeID = results.Properties[“employeeID”][0].ToString();

    if (results.Properties.Contains(“displayName”))

    displayName = results.Properties[“displayName”][0].ToString();
    else
    displayName = “No Full Name”;

  10. [ADO NET Source [2]] Error: The ADO NET Source was unable to process the data.
    ‘ADsDSOObject’ failed with no error message available, result code: -2147016669(0x80072023).

  11. hi

    i am trying to load data from active directory to sql table by using adonet source whenever i am running package upto 9000 records are loaded to sql table after 9000 records package was fail and i am getting two errors
    please any one help me how can i solve this two errors

    1.[ADO NET Source [2]] Error: The ADO NET Source was unable to process the data.
    ‘ADsDSOObject’ failed with no error message available, result code: -2147016669(0x80072023).

    2.[SSIS.Pipeline] Error: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED.
    The PrimeOutput method on ADO NET Source returned error code 0xC02090F5.
    The component returned a failure code when the pipeline engine called PrimeOutput().
    The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.
    There may be error messages posted before this with more information about the failure.

  12. i have one excel file in that one column my data look like below.

    distinguishedname(column name)
    ————————————————————————————————————-

    CN=MMWTRNG,OU=ITTE US,OU=MAh USERS,DC=co,DC=mah,DC=com
    CN=MM-KND2,OU=Dom Co,DC=co,DC=mah,DC=com
    CN=MAHADMIN,OU=SERVICE IDs,OU=Serv Accounts,DC=co,DC=mah,DC=com
    CN=MMCAS,OU=SERV,ou=appear,ou=set,DC=co,DC=mah,DC=com
    CN=I_USRDKNSA2,OU=SERV IDs,OU=Serv Acc,DC=co,DC=mah,DC=com

    i need to load the data to sql table only ou data.
    what is the best approcah please any one help me.

    output

    ou(column)
    —————–
    1.MAh USERS/ITTE US
    2.Serv Accounts/SERVICE IDs
    3.set/appeear/SERV

  13. Hi Marina,

    This article is very helpful. I need to retrieve fields from an LDAP source which pearl script previously performed. I have two questions 1) regarding data discovery – is there a way to test/run this up to the ds.FindAll())) section just to see the “shape” of the data? That is before any concerns of migration or inserts? I ask because I do not know the fields from the source and want to see all the fields. If there is a way can I do this ad-hoc in VS? 2) once I have better context to the fields do I declare them in the ds.Filter section?

    Thank you in advance for any additional insight

    Vinnie

  14. please provide solution for this i am also facing same issue.

    i am trying to load data from active directory to sql table by using adonet source whenever i am running package upto 9000 records are loaded to sql table after 9000 records package was fail and i am getting two errors
    please any one help me how can i solve this two errors

    1.[ADO NET Source [2]] Error: The ADO NET Source was unable to process the data.
    ‘ADsDSOObject’ failed with no error message available, result code: -2147016669(0x80072023).

    2.[SSIS.Pipeline] Error: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED.
    The PrimeOutput method on ADO NET Source returned error code 0xC02090F5.
    The component returned a failure code when the pipeline engine called PrimeOutput().
    The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.
    There may be error messages posted before this with more information about the failure.

Comments are closed.