Revisited: Delete From FTP using Azure Data Factory
Due to demand, I have decided to revisit the solution that deletes files from FTP from within a Data Factory pipeline. I believe this approach is a much cleaner design, and it is also easier to debug and control.
First, make sure you have read my article on setting up a Key Vault. It is crucial knowledge, as you are going to store the FTP credentials there. Set it up and create two secrets: one for the login and one for the password. If you wish, you may also store the IP address of your FTP server.
Next, create a new Azure Function:
Make sure to pick v2:
Next, paste the following code into the solution:
```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;
using Newtonsoft.Json;
using System.Collections.Generic;
using System.Net;
using Microsoft.Azure.Services.AppAuthentication;
using Microsoft.Azure.KeyVault;

namespace DeleteFromFTP_Function
{
    public static class DeleteFromFTP
    {
        [FunctionName("DeleteFromFTP")]
        public static async Task<IActionResult> Run(
            [HttpTrigger(AuthorizationLevel.Function, "get", "post", Route = null)] HttpRequest req,
            ILogger log)
        {
            log.LogInformation("C# HTTP trigger function processed a request.");

            // Key Vault client that authenticates with a token from AzureServiceTokenProvider.
            var azureServiceTokenProvider = new AzureServiceTokenProvider();
            var kv = new KeyVaultClient(new KeyVaultClient.AuthenticationCallback(azureServiceTokenProvider.KeyVaultTokenCallback));

            // The FTPLogin and FTPPassword application settings hold the secret URLs.
            string Login = kv.GetSecretAsync(Environment.GetEnvironmentVariable("FTPLogin")).Result.Value;
            string Password = kv.GetSecretAsync(Environment.GetEnvironmentVariable("FTPPassword")).Result.Value;

            // Path and IP can come from the query string or from the JSON body.
            string Path = req.Query["Path"];
            string IP = req.Query["IP"];

            string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
            dynamic data = JsonConvert.DeserializeObject(requestBody);
            Path = Path ?? data?.Path;
            IP = IP ?? data?.IP;

            // Without both values the FTP address cannot be built, so stop here.
            if (string.IsNullOrEmpty(Path) || string.IsNullOrEmpty(IP))
            {
                return new BadRequestObjectResult("Please pass Path and IP in the query string or in the request body.");
            }

            DeleteFTPDirectory(Path + "/", IP, Login, Password);

            return (ActionResult)new OkObjectResult(data);
        }

        // Returns the names of the entries in the given FTP folder.
        public static List<string> DirectoryListing(string Path, string ServerAdress, string Login, string Password)
        {
            FtpWebRequest request = (FtpWebRequest)WebRequest.Create("ftp://" + ServerAdress + Path);
            request.Credentials = new NetworkCredential(Login, Password);
            request.Method = WebRequestMethods.Ftp.ListDirectory;

            FtpWebResponse response = (FtpWebResponse)request.GetResponse();
            Stream responseStream = response.GetResponseStream();
            StreamReader reader = new StreamReader(responseStream);

            List<string> result = new List<string>();
            while (!reader.EndOfStream)
            {
                result.Add(reader.ReadLine());
            }

            reader.Close();
            response.Close();
            return result;
        }

        // Deletes a single file from the FTP server.
        public static void DeleteFTPFile(string Path, string ServerAdress, string Login, string Password)
        {
            FtpWebRequest clsRequest = (System.Net.FtpWebRequest)WebRequest.Create("ftp://" + ServerAdress + Path);
            clsRequest.Credentials = new System.Net.NetworkCredential(Login, Password);
            clsRequest.Method = WebRequestMethods.Ftp.DeleteFile;

            string result = string.Empty;
            FtpWebResponse response = (FtpWebResponse)clsRequest.GetResponse();
            long size = response.ContentLength;
            Stream datastream = response.GetResponseStream();
            StreamReader sr = new StreamReader(datastream);
            result = sr.ReadToEnd();

            sr.Close();
            datastream.Close();
            response.Close();
        }

        // Deletes every entry in the listing. Entries are treated as files; nested folders are not handled.
        public static void DeleteFTPDirectory(string Path, string ServerAdress, string Login, string Password)
        {
            List<string> filesList = DirectoryListing(Path, ServerAdress, Login, Password);
            foreach (string file in filesList)
            {
                DeleteFTPFile(Path + file, ServerAdress, Login, Password);
            }
        }
    }
}
```
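The function builds every FTP address by plain concatenation ("ftp://" + ServerAdress + Path), so the Path you send has to start with a slash, for example /upload/incoming. Below is a minimal sketch of a hypothetical helper, FtpUri.ForDirectory, that would tolerate a missing or doubled slash. The helper name and the normalization rule are my assumptions and are not part of the function above.

```csharp
using System;

public static class FtpUri
{
    // Hypothetical helper (not part of the function above): joins server and path
    // so that exactly one "/" separates them and the result ends with "/".
    public static string ForDirectory(string server, string path)
    {
        if (string.IsNullOrWhiteSpace(server))
        {
            throw new ArgumentException("Server is required.", nameof(server));
        }
        if (string.IsNullOrWhiteSpace(path))
        {
            throw new ArgumentException("Path is required.", nameof(path));
        }

        // Drop leading and trailing slashes, then add them back exactly once.
        string trimmed = path.Trim('/');
        return trimmed.Length == 0
            ? "ftp://" + server + "/"
            : "ftp://" + server + "/" + trimmed + "/";
    }
}
```

With such a helper, both "upload/incoming" and "/upload/incoming/" would resolve to the same folder address.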
Notice that Login and Password are taken from the Key Vault:
```csharp
string Login = kv.GetSecretAsync(Environment.GetEnvironmentVariable("FTPLogin")).Result.Value;
string Password = kv.GetSecretAsync(Environment.GetEnvironmentVariable("FTPPassword")).Result.Value;
```
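Because Run is already an async method, the two lookups could also be awaited rather than blocked on with .Result. Here is a minimal sketch of that idea, assuming a hypothetical helper FtpSecrets.ReadAsync. It takes the actual Key Vault call as a delegate so that the sketch itself depends only on the base class library; the helper name and the delegate shape are my assumptions.

```csharp
using System;
using System.Threading.Tasks;

public static class FtpSecrets
{
    // Hypothetical helper: reads the secret URL from the given application setting
    // and resolves it through fetchSecret without blocking the calling thread.
    public static async Task<string> ReadAsync(string settingName, Func<string, Task<string>> fetchSecret)
    {
        string secretUrl = Environment.GetEnvironmentVariable(settingName);
        if (string.IsNullOrEmpty(secretUrl))
        {
            // Fail early with a readable message instead of passing null to Key Vault.
            throw new InvalidOperationException(
                "Application setting '" + settingName + "' is missing or empty.");
        }

        return await fetchSecret(secretUrl);
    }
}
```

Inside Run, this could be used as `string Login = await FtpSecrets.ReadAsync("FTPLogin", async url => (await kv.GetSecretAsync(url)).Value);`, and likewise for FTPPassword.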
The secret URLs are read from environment variables, which you can find in the function app's application settings:
Under application settings:
The next step is to publish your function. You can do it from within Visual Studio by right-clicking on the project:
There, pick the Azure Function that you want to publish to, or alternatively create a new one. Remember to register your app in Azure Active Directory and add it to the access policies of your Key Vault.
After publishing, navigate to the Azure Portal and test your function. It expects JSON in this format:
```json
{
    "Path": "<Path to the folder you want to clear>",
    "IP": "<IP of your FTP>"
}
```
If it does not work, you can debug it now.
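Besides the portal, the function can also be exercised from a small C# program. Below is a minimal sketch that only builds the POST request with the expected JSON body. The class name DeleteFromFtpRequest is hypothetical, and I am assuming .NET Core 3.0 or later so that System.Text.Json is available.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;

public static class DeleteFromFtpRequest
{
    // Builds the POST request the function expects: a JSON body with Path and IP.
    // functionUrl is the URL of your published function (a placeholder here).
    public static HttpRequestMessage Build(string functionUrl, string path, string ip)
    {
        // Property names are kept as written, so the body contains "Path" and "IP".
        string body = JsonSerializer.Serialize(new { Path = path, IP = ip });

        return new HttpRequestMessage(HttpMethod.Post, functionUrl)
        {
            Content = new StringContent(body, Encoding.UTF8, "application/json")
        };
    }
}
```

To actually send it, pass the result to `HttpClient.SendAsync` and inspect the status code of the response.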
The next step is sending a POST request from Data Factory. To do it, you will use a Web activity. Beforehand, however, you have to copy the function URL. To get it, navigate to the main function page and click </> Get Function URL.
Copy the function URL from the URL field:
You are ready to go to your ADF pipeline and create a web activity:
Next, specify the URL (that you have just copied) and a JSON body with Path and IP. You can even try to set up some dynamic content here:
And that is it! Now you can trigger the function and see if it works.