Apache PDFBox
Developer(s) | Apache Software Foundation |
---|---|
Stable release | |
Repository | PDFBox Repository (Mirror) |
Written in | Java |
Operating system | Cross-platform |
Type | Portable Document Format (PDF) |
License | Apache License 2.0 |
Website | pdfbox |
Apache PDFBox is an open-source, pure-Java library that can be used to create, render, print, split, merge, alter and verify PDF files, and to extract their text and metadata.
Open Hub reports over 11,000 commits (counted from the project's start at Apache) by 18 contributors, representing more than 140,000 lines of code. PDFBox has a well-established, mature codebase maintained by an average-sized development team with increasing year-over-year commits. Under the COCOMO model, the codebase represents an estimated 46 person-years of effort.[2]
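As a rough illustration of how a COCOMO-style figure is derived, the Basic COCOMO formula estimates effort as a × KLOC^b person-months. The sketch below assumes the textbook "organic" coefficients (a = 2.4, b = 1.05); the source does not state which parameters or exact line count Open Hub used, so the class name, constants and input here are illustrative assumptions only.

```java
public class CocomoSketch {
    // Basic COCOMO "organic" coefficients: an assumption for illustration,
    // not a statement of the parameters behind the figure quoted above.
    static final double A = 2.4;
    static final double B = 1.05;

    /** Estimated effort in person-months for a size given in thousands of lines (KLOC). */
    public static double personMonths(double kloc) {
        return A * Math.pow(kloc, B);
    }

    /** The same estimate expressed in person-years. */
    public static double personYears(double kloc) {
        return personMonths(kloc) / 12.0;
    }

    public static void main(String[] args) {
        // "More than 140,000 lines of code" is treated as a lower bound of 140 KLOC.
        double kloc = 140.0;
        System.out.printf("%.0f KLOC -> %.1f person-years%n", kloc, personYears(kloc));
    }
}
```

With these assumed coefficients, 140 KLOC works out to roughly 36 person-years. Because the estimate grows with code size, the reported 46 person-years would correspond to a larger line count under the same coefficients, which is consistent with the "more than 140,000" wording.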
Structure
Apache PDFBox has these components:
- PDFBox: the main part
- FontBox: handles font information
- XmpBox: handles XMP metadata
- Preflight (optional): checks PDF files for PDF/A-1b conformance
History
PDFBox was started in 2002 on SourceForge by Ben Litchfield, who wanted to be able to extract text from PDF files for Lucene.[3] It became an Apache Incubator project in 2008 and an Apache top-level project in 2009.[4]
Preflight was originally named PaDaF. It was developed by Atos Worldline and donated to the project in 2011.[5]
In February 2015, Apache PDFBox was named an Open Source Partner Organization of the PDF Association.[6]
References
- [1] "Apache PDFBox - Blog". pdfbox.apache.org. Apache Software Foundation. Retrieved 2022-09-27.
- [2] "The Apache PDFBox Open Source Project on Open Hub". openhub.net. 2017-03-18. Retrieved 2017-03-18.
- [3] "Apache PDFBox and FontBox 1.0.0 released". The H Open. 16 February 2010.
- [4] "PDFBox Project Incubation Status".
- [5] "PaDaF Preflight Codebase Intellectual Property (IP) Clearance Status".
- [6] "Apache™ PDFBox™ named an Open Source Partner Organization of the PDF Association". February 3, 2015.
External links
- Apache PDFBox Project