The big idea is that I need to compute the number of subsidiaries reported in Exhibit 21 within Form 10-K, as stored in the EDGAR database (always Exhibit 21, mandated by statute). One approach relies on the fact that EDGAR stores exhibits as separate HTM files. Here is a link to a Microsoft filing that might help you see the idea:
[login to view URL]
If we can retrieve the HTM files, then we only need to strip the headers and whitespace and count the subsidiaries. An alternative approach relies on the fact that the complete submission text file contains the exhibits as well. This route entails extracting the Exhibit 21 data from the text file, then stripping and counting. I don't think regex is the best tool for HTML-type data.
Approach is your call.
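For illustration only, here is a minimal sketch of the second route (complete submission text file, then strip and count). It is written in Python rather than Perl, purely to show the logic, and it uses an HTML parser instead of regex. The function names (extract_exhibit21, count_subsidiaries) are made up for this sketch. It assumes each document in the text file is wrapped in <DOCUMENT>...</DOCUMENT> with a <TYPE> line such as EX-21 or EX-21.1, and that the exhibit lists one subsidiary per HTML table row. The header-row test is a rough heuristic, and exhibits laid out as plain text or paragraphs would need different handling.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect each <tr> as a list of its non-empty cell texts."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace (including non-breaking spaces).
            text = " ".join("".join(self._cell).split())
            if text:
                self._row.append(text)
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # drop rows that held only whitespace
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_exhibit21(submission):
    """Return the <TEXT> body of the first EX-21 document, or None."""
    pos = 0
    while True:
        start = submission.find("<DOCUMENT>", pos)
        if start == -1:
            return None
        end = submission.find("</DOCUMENT>", start)
        if end == -1:
            end = len(submission)
        doc = submission[start:end]
        type_at = doc.find("<TYPE>")
        if type_at != -1:
            doc_type = doc[type_at + len("<TYPE>"):].partition("\n")[0].strip()
            if doc_type.upper().startswith("EX-21"):
                return doc.partition("<TEXT>")[2].partition("</TEXT>")[0]
        pos = end + 1


def count_subsidiaries(exhibit_html):
    """Count table rows that do not look like a heading (heuristic)."""
    parser = RowCollector()
    parser.feed(exhibit_html)
    parser.close()
    count = 0
    for row in parser.rows:
        first = row[0].lower()
        # Assumption: heading rows start with "name" or mention "subsidiar...".
        if first.startswith("name") or "subsidiar" in first:
            continue
        count += 1
    return count
```

Typical use would be to read one complete submission text file into a string, pass it to extract_exhibit21, and pass the result (if not None) to count_subsidiaries. The counts should be spot-checked by hand against a few filings, since exhibit layouts vary from company to company.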
The final result should generate variables for:
-CIK (central index key).
-period end (conformed period of report, yyyymmdd format preferred).
-filing date (date received at SEC, yyyymmdd format preferred).
-form type (e.g., 10-K).
-company conformed name.
-the SIC number (standard industrial classification, 4 digits).
-the number of subsidiaries reported in Exhibit 21.
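The first six variables above can probably be read straight from the header block at the top of the complete submission text file. Below is a rough sketch in Python (for illustration only, the same logic ports to Perl). It assumes the header ends at </SEC-HEADER> and uses the labels CENTRAL INDEX KEY, CONFORMED PERIOD OF REPORT, FILED AS OF DATE, CONFORMED SUBMISSION TYPE, COMPANY CONFORMED NAME and STANDARD INDUSTRIAL CLASSIFICATION, with the SIC code given in square brackets after the industry description. The names parse_header and HEADER_FIELDS, and the output keys, are made up for this sketch.

```python
# Header label -> output variable name (output names are illustrative).
HEADER_FIELDS = {
    "CENTRAL INDEX KEY": "cik",
    "CONFORMED PERIOD OF REPORT": "period_end",
    "FILED AS OF DATE": "filing_date",
    "CONFORMED SUBMISSION TYPE": "form_type",
    "COMPANY CONFORMED NAME": "company_name",
    "STANDARD INDUSTRIAL CLASSIFICATION": "sic",
}


def parse_header(submission):
    """Return a dict of the header variables found in the submission text."""
    # Everything before </SEC-HEADER>; the whole text if the tag is absent.
    header = submission.partition("</SEC-HEADER>")[0]
    record = {}
    for line in header.splitlines():
        label, sep, value = line.partition(":")
        key = HEADER_FIELDS.get(label.strip())
        # Skip lines without a known label; keep the first occurrence only.
        if not sep or key is None or key in record:
            continue
        value = value.strip()
        if key == "sic":
            # Assumption: the value looks like "DESCRIPTION [1234]".
            left, right = value.rfind("["), value.rfind("]")
            value = value[left + 1:right] if -1 < left < right else ""
        record[key] = value
    return record
```

The dates in the header already appear in yyyymmdd form, so no conversion should be needed. The seventh variable, the subsidiary count, has to come from the exhibit itself rather than from the header.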
I would prefer Perl, with generous documentation so that I can modify the script for future projects.
Please let me know if you have any questions.
Thank you!
Thanks for your project.
I am the premier Perl scripting expert on these freelancing sites. I will design a Perl script to emulate a browser accessing one of these filing documents and parse the HTML returned for the data desired. The script will follow the link to the Ex-21 document to count the number of subsidiaries. All the data found shall be output in text format.
A milestone payment for the full budget for this project must be deposited with this site before your offer can be accepted.
Alan Idler
Chief Software Architect
Idleswell Software Creations
Hi,
I have more than 14 years of data extraction and VBA/Excel experience, and I am an expert in this kind of work. Let us discuss further to review the requirements.
I have completed more than 270 projects. Please look at the feedback left by my employers to know more about my work.
Waiting for your positive response.
Thanks.
Hello, there!
I don't do Perl but I can do it in Python.
I have one question: you gave a Microsoft filing as an example. How do you intend to tell the script which company to scrape data for?